Comments on ecmanaut: Encoding / decoding UTF8 in javascript
Feed author: Johan Sundström (http://www.blogger.com/profile/04076097346172610543)
Updated: 2012-10-17T10:09:12.903-07:00

shdanfo, 2010-08-06T13:55:15.721-07:00
Thanks for your note on decodeURIComponent. I'm dealing with an XSLT template that includes something like the following text:

    <a href="javascript:alert('Désolé')">XXX</a>

Now, XSLT is required to produce UTF-8 for Désolé because it's in an anchor, so the alert function then gets an unescaped version of this. My final solution after reading your note: call the following function (LOL) instead:

    function alertDecodeURI(text) {
      alert(decodeURIComponent(escape(text))); // escape to get back URI encoding
    }

Paul (http://www.google.com/profiles/117275094170439175752), 2010-08-05T01:47:56.368-07:00
Thanks!

We had been sending UTF-8 data through jQuery, which was dealing with it fine in all browsers, until we switched to running our app in a WebKit component embedded in a PyQt application. We are still sending UTF-8 from Python, but it goes through an implicit "eval()" call, and I was ending up with Â£ (capital A, circumflex accent, pound sterling symbol) instead of £ (pound sterling symbol).

Popping the UTF-8 strings through the above has fixed this.

This is the QtWebkit 4.6.2.0

Win32, Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB) AppleWebKit/532.4 (KHTML, like Gecko) Qt/4.6.2 Safari/532.4

Thanks again

Alice (http://alice-thomas222.myopenid.com/), 2010-07-13T22:10:11.360-07:00
Thanks for sharing

Simon (https://www.blogger.com/profile/08999351403559982314), 2010-04-14T09:22:51.979-07:00
Still now in 2010, this is the only usable search result on solving this specific problem. Has nobody else noticed and written about this?

Also an UPDATE: Works in Chrome --
Win32, Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1045 Safari/532.5

Unknown (https://www.blogger.com/profile/13461418051535966555), 2010-03-18T00:11:32.392-07:00
For completeness, it would be nice if you extended the article with a bit of code highlighting that with this, you can create safe urls as well, by percent encoding all the utf8 bits:

    function percent_encode( s )
    {
        var utf = encode_utf8( s );
        var enc = '';
        for( var i = 0; i < utf.length; i++ )
        {
            var hex = utf.charCodeAt( i ).toString( 16 );
            enc += '%' + ( hex.length < 2 ? '0' : '' ) + hex; // zero-pad bytes below 0x10
        }
        return enc;
    }

Anonymous (https://www.blogger.com/profile/08124981585781818010), 2010-02-05T17:58:47.852-08:00
Very good! Thank you very much. I've been having problems with a Facebook Connect site where the Facebook stream.publish API method was not recognizing the accented characters being sent through a javascript function. I couldn't find any suggestions anywhere until I came across this blog. I applied your suggestion and voilà! Problem resolved!

bucabay (http://bucabay.com/), 2009-06-08T05:02:10.909-07:00
Best solution I've seen so far. The others usually are bitwise operations to get the UTF-8 byte sequences, but don't fully implement the encoding.

Johan Sundström, 2009-06-05T12:07:45.155-07:00
Well, somewhere between your UTF-8 encoders and this blog, something is going wrong at least, because „ and – and … (code points 8222, 8211 and 8230 respectively, all way beyond 255) are not 8-bit characters, which every octet in a valid UTF-8 string must be, by definition.

Maybe you are sitting on some Windows system doing fancy quotes under your feet, or similar muck. Best of luck with your debugging.

Conny, 2009-06-05T08:19:35.800-07:00
Hmm, well... If RÄKSMÖRGÅS is encoded to utf8 with Javascript, it gets RÄKSMÖRGÃ…S. But if the same word is encoded to utf8 with Java or UltraEdit Text Editor (ASCII to UTF-8), it gets RÄKSMÖRGÃ…S.

I am parsing an XML document encoded as utf8 (the Java version...) in Javascript.

So, are there two versions of utf8?

Johan Sundström, 2009-06-04T12:10:12.259-07:00
That is because the UTF-8 encoding of RÄKSMÖRGÅS is RÃKSMÃRGà S, not RÄKSMÖRGÃ…S. (Any incorrectly encoded input sequence will probably give you that error or a similar one.)

Conny, 2009-06-03T23:06:29.872-07:00
How about uppercase letters?

For me your example räksmörgÃ¥s is correctly decoded into räksmörgås but RÄKSMÖRGÃ…S gives a "malformed URI sequence" error.

Anonymous, 2009-05-02T17:28:00.000-07:00
Result of your platform/browser:
Win32, Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8 (.NET CLR 3.5.30729)

My $_SERVER['HTTP_USER_AGENT']:
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Remarks:
I am running Internet Explorer version 8.0, which is operating in some sort of compatibility mode, and suddenly the executed javascript makes its ajax requests with xmlHttp=new XMLHttpRequest(), where before (IE 6 and 7) it would execute xmlHttp=new ActiveXObject("Msxml2.XMLHTTP") or xmlHttp=new ActiveXObject("Microsoft.XMLHTTP")

take care!

Anonymous, 2009-05-02T00:16:00.000-07:00
internet explorer 8:
Win32, Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Johan Sundström, 2009-03-12T13:11:00.000-07:00
In that case you are doing it wrong; unescape(encodeURIComponent("€")) === "\xE2\x82\xAC" and decodeURIComponent(escape("\xE2\x82\xAC")) === "€" both return true, as they are supposed to.

Anonymous, 2009-03-12T07:20:00.000-07:00
Hi, I believe that this method doesn't work for characters like the EURO symbol €. In Firefox I get a "Malformed URI sequence" error.

Johan Sundström, 2008-09-07T13:35:00.000-07:00
Same principles as in all programming apply: don't decode UTF-8 that isn't; don't divide by zero, and so on. Just adding a try/catch will hide errors in input and is inadvisable.

scott (https://www.blogger.com/profile/07776513885232317947), 2008-09-07T12:47:00.000-07:00
Excellent stuff. I wanted to note that:
decode_utf8()
can throw an error. It is probably a good idea to wrap the call in a try...catch

I think I was getting this error when trying to decode UTF-8 when it was really ISO 8859.

Johan Sundström, 2008-03-16T01:22:00.000-07:00
Put it between a <script> and </script> tag and you'll be fine.

Anonymous, 2008-03-15T13:10:00.000-07:00
Just one question... does it matter where you put that code? I'm completely new to javascript, so I don't have any idea.

Anonymous, 2008-02-26T03:55:00.000-08:00
Hi Johan

Thanks for the great little UTF-8 hack :) I've used it in my open source tool Hackvertor:
http://www.businessinfo.co.uk/labs/hackvertor/hackvertor.php

btw this isn't comment spam, Johan gave me permission to plug my tool :)

Riši (https://www.blogger.com/profile/11385555398508163728), 2007-10-27T07:30:00.000-07:00
Got it, thanks.

Johan Sundström, 2007-10-27T06:34:00.000-07:00
What did you expect? That is how the UTF8 encoded text is represented when the undecoded UTF8 message is seen as a normal eight-bit string, when your character set is ISO-8859-1, commonly referred to as Latin-1.

Riši (https://www.blogger.com/profile/11385555398508163728), 2007-10-27T03:51:00.000-07:00
Thanks Johan.
However, on my machine, this does not quite work: the word got encoded as
räksmörgÃ¥s

Win32, Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.8) Gecko/20071008 Firefox/2.0.0.8

Any idea why?

Anonymous, 2007-03-06T04:48:00.000-08:00
Worked like a charm with:
Linux Firefox/2.0.0.2 (Ubuntu-edgy)

Anonymous, 2007-02-22T08:30:00.000-08:00
Best solution, thanks.
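
Editorial note: for readers arriving at this thread without the article, the helpers the commenters refer to can probably be sketched as below. This assumes that encode_utf8 and decode_utf8 (the names used in the comments) wrap the two expressions Johan quotes in his 2009-03-12 reply. try_decode_utf8 is a hypothetical name introduced here for illustration only; it is not from the article or from any commenter.

```javascript
// Sketch only: assumes encode_utf8/decode_utf8 wrap the expressions quoted
// in the thread. escape/unescape are legacy globals that browsers and Node
// still provide.

// Turn a JavaScript string into its UTF-8 form, one character per byte
// (every char code in the result is in the range 0-255).
function encode_utf8(s) {
  return unescape(encodeURIComponent(s));
}

// Reverse of encode_utf8. Throws URIError when the input is not valid
// UTF-8, e.g. ISO-8859-1 text or characters with code points above 255.
function decode_utf8(s) {
  return decodeURIComponent(escape(s));
}

// Hypothetical helper (not from the article): returns null for input that
// is not valid UTF-8, so the caller has to handle the bad input explicitly
// instead of a bare try/catch silently swallowing it.
function try_decode_utf8(s) {
  try {
    return decode_utf8(s);
  } catch (e) {
    if (e instanceof URIError) return null;
    throw e; // anything else is a real bug; do not hide it
  }
}

// The two identities quoted in the thread:
var euroBytes = encode_utf8("\u20AC");     // "\xE2\x82\xAC"
var euroAgain = decode_utf8(euroBytes);    // "\u20AC", the euro sign

// A lone Latin-1 byte (0xE4, "ä" in ISO-8859-1) is not valid UTF-8:
var notUtf8 = try_decode_utf8("\xE4");     // null
```

The null return keeps the failure visible to the caller, which seems closer to Johan's advice than a catch block that ignores the error; whether to reject, log, or fall back to Latin-1 remains a decision for the calling code.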