2006-08-01

Unicode code point bookmarklet

This bookmarklet is for all of you who want to look up the unicode code point for some character you have encountered, or, the other way around, when you have a unicode code point you want the unicode character for.

I fairly often (at least several times per year, sometimes month) find myself in this situation, for some reason or other. Most of the time, it's that I never learned how to convince keyboards to deliver a particular character ("×", for instance), but took my time memorizing its code point, so I'd be able to just type it into my Emacs by typing Control-Q followed by its (octal) character code (327), regardless of whether I was at home, in Sweden, France, San Francisco, on a macintosh, unix or windows machine, or a VT100 terminal, or... ...yes, enough already; they get the point. And occasionally, it's some piece of random Unicode trivia that for some reason stuck, like that the roman numerals start at U+21B0. (Beg your pardon if I poisoned your mind there.)

Anyway, browser bookmarks are about as good and trusty as Emacsen, even if you sometimes have to look them up when in an unfriendly environment (unless you're already using Google browser sync or similar friendly tools to bring your browser environment with you wherever you go). The above bookmarklet is crafted to integrate nicely with Firefox's bookmark keywords, so you can stash it away somewhere deep in the bookmarks hierarchy to get lost in peace. Assuming you gave the bookmark the keyword "char", type "char 64" into your address bar when, say, you forgot where they hid the @ sign in the Swedish macintosh locale, and thus produce the wanted character anyway via a trip over the clipboard. I love the clipboard. (Just clicking the bookmarklet, or invoking it without a parameter, will ask for the character or character code instead.)

Just typing a number verbatim treats it as decimal, prefix it with a zero to mark it as octal (0327 for the multiplication sign), or 0x or U+ for hexadecimal, as in 0x216B for the roman numeral twelve, (yes, U+2160 isn't zero but , since the good Romans didn't feel much need for any zeroes).

And, of course, pasting the one-off 愛 猫 ♀ ❤ useful character into it gives you the brain bugs that prove oh, so useful when you're in a poor input locale with a mindful of numbers that hog your brain. I'm sure it happens all the time!
blog comments powered by Disqus