- Unicode and UTF-8
- Unicode file reader
- A little Unicode editor
- i18n tester quickly shows what parts of Unicode are supported by your fonts
- dead keys for accents - a tiny package allowing easier entering of accented characters
http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html has ISO 10646 fonts that are useful to install on Unix.
http://www.slovo.info/unifonts.htm links to free TrueType fonts for larger or smaller subsets of Unicode.
Two fonts for writing almost any Latin-script-based language, one of them Times Roman-like, together with many articles about Unicode.
Until version 3.0, 16 bits (\u0000-\uFFFD: the "Basic Multilingual Plane", BMP) were sufficient for any Unicode character. From 3.1 on, we must expect longer codes - up to 31 bits long, as specified in ISO 10646. Why 31 bits? Because that is the maximum that can be expressed in UTF-8 as originally defined: 6 bytes, omitting the taboo values \xFE and \xFF. (RS)
Since RFC 3629 (2003), UTF-8 is restricted to at most 4 bytes, covering U+0000 through U+10FFFF.
 1111110a 10aaaaaa 10bbbbbb 10bbcccc 10ccccdd 10dddddd
(small letters stand for the "payload" bits of bytes a..d; the highest byte, a, has only 7 bits)
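The bit layout above can be tried out with a few lines of Tcl. This is only a sketch of the original 1-to-6-byte scheme, not how Tcl itself encodes strings; the proc name utf8bytes is made up for this page.

```tcl
# Encode an integer code point with the original UTF-8 scheme (1 to 6 bytes).
# Returns a list of byte values. "utf8bytes" is a hypothetical helper name.
proc utf8bytes {cp} {
    set cp [expr {$cp + 0}]   ;# normalize input such as 0xE4 to a plain integer
    if {$cp < 0 || $cp > 0x7FFFFFFF} {
        error "code point outside the 31-bit range: $cp"
    }
    if {$cp < 0x80} {
        return [list $cp]     ;# plain ASCII: one byte, no marker bits
    }
    # Default to the longest form, then pick the shortest one that fits.
    # Each triple is: exclusive upper limit, lead-byte marker, sequence length.
    set lead 0xFC
    set n 6
    foreach {limit l k} {
        0x800     0xC0 2
        0x10000   0xE0 3
        0x200000  0xF0 4
        0x4000000 0xF8 5
    } {
        if {$cp < $limit} {
            set lead $l
            set n $k
            break
        }
    }
    set bytes {}
    # Continuation bytes carry six payload bits each, lowest bits last.
    for {set i 1} {$i < $n} {incr i} {
        set bytes [linsert $bytes 0 [expr {0x80 | ($cp & 0x3F)}]]
        set cp [expr {$cp >> 6}]
    }
    # The remaining high bits go into the lead byte.
    return [linsert $bytes 0 [expr {$lead | $cp}]]
}
```

For example, utf8bytes 0xE4 gives 195 164 (hex C3 A4), and the largest 31-bit value 0x7FFFFFFF gives FD followed by five BF bytes, so \xFE and \xFF never occur.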
Character Sets And Code Pages At The Push Of A Button: http://www.i18nguy.com/unicode/codepages.html
Lots of other good info at the same site's home page.
"ASCII, dammit"  goes the other way.
WJP The programs uni2ascii and ascii2uni  convert between Unicode and numerous types of ASCII escape, including the \uXXXX notation used in Tcl. uni2ascii can also convert Unicode characters to similar ASCII characters, e.g. "smart quotes" to ordinary quotes, letters with diacritics to the same letter without them, etc.
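For the \uXXXX direction only, something similar can be sketched in pure Tcl. This is not how uni2ascii works internally; the proc name u2a is just a label chosen here, and the sketch assumes all characters are in the BMP, so that four hex digits suffice.

```tcl
# Replace every non-ASCII character of a string by its \uXXXX escape.
# Assumes BMP characters only (code points up to 0xFFFF).
proc u2a {s} {
    set out ""
    foreach c [split $s ""] {
        scan $c %c code          ;# numeric code point of this character
        if {$code < 128} {
            append out $c
        } else {
            append out [format "\\u%04X" $code]
        }
    }
    return $out
}
```

The result is plain ASCII and can be turned back into the original string with subst -nocommands -novariables.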
Unicode Explained  (book review)
LV What other fonts are useful for a system to have available to display Unicode text? Many of this wiki's pages look like damaged nonsense (for instance, displaying fraction symbols etc. rather than language characters). I'm using Windows XP, so I would have thought I would have the most popular fonts. Or maybe there are some settings I need to make in Firefox 2.x? An example of the type of page I mean is (well, this morning - 2007 June 19 10:10 EDT) http://wiki.tcl.tk/34
Jannis A good font to display most characters found in Unicode is the DejaVu font: http://dejavu.sf.net
From comp.lang.tcl, the following exchange occurred during April, 2008:
 Newsgroups: comp.lang.tcl
 From: r_haer...@gmx.de
 Date: Sat, 26 Apr 2008 11:55:45 -0700 (PDT)
 Local: Sat, Apr 26 2008 2:55 pm
 Subject: unicode - get character representation from \uxxx notation

 Hello,

 to show my problem see the following example:

 > set tcl_patchLevel
 8.5.3b1
 > set str "n\u00E4mlich"
 nämlich
 > set c 0xE4
 > set str "n\\u[format %04.4X $ch]mlich"
 n\u00E4mlich

 How do I get the \u00E4 in the character representation let's say iso8859-1 ?

 > encoding convertto iso8859-1 $str

 Newsgroups: comp.lang.tcl
 From: billpo...@alum.mit.edu
 Date: Sat, 26 Apr 2008 14:21:27 -0700 (PDT)
 Local: Sat, Apr 26 2008 5:21 pm
 Subject: Re: unicode - get character representation from \uxxx notation

 To convert the hex number expressed as a string 0x00e4 to a Unicode
 character, use:

 format "%c" 0x00e4

 You can then use encoding convertto to convert this to another
 encoding, e.g.:

 encoding convertto iso8859-1 [format "%c" 0x00e4]
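Putting the answer together with the question's example gives roughly the following. It is only a sketch: the variable names are arbitrary, and the subst variant is an addition that was not part of the newsgroup reply.

```tcl
# Build the string from a numeric code point with format %c,
# instead of pasting a literal backslash-u sequence into it.
set ch 0xE4
set str "n[format %c $ch]mlich"

# If a string already holds a literal \uXXXX escape (as in the question),
# subst can expand it; the switches keep [ ] and $ from being evaluated.
set literal "n\\u[format %04X $ch]mlich"
set expanded [subst -nocommands -novariables $literal]

# Convert to ISO 8859-1 bytes and look at them as hex digits.
set bytes [encoding convertto iso8859-1 $str]
binary scan $bytes H* hex
puts $hex   ;# 6ee46d6c696368
```

Both $str and $expanded hold the same seven characters; in the ISO 8859-1 byte string the a-umlaut is the single byte E4.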
LV 2008 Jul 08 I've a request from a developer concerning whether Tcl is capable of handling characters outside the Unicode BMP. His application was using tdom and it encountered the 𝒜 character, which is a script A, Unicode value 0x1D49C, which tdom reports it can't handle because it is limited to UTF-8 chars up to 3 bytes in length.
What do Tcl programmers do to properly process the longer characters?
Note this is in an enterprise setting. Finding a solution is critical in the publishing (web or print) arena.
RS 2008-07-09: Unicode out of BMP (> U+FFFF) requires a deeper rework of Tcl and Tk: we'd need 32 bit chars and/or surrogate pairs. UTF-8 at least can deal with 31-bit Unicodes in principle.
LV During July, 2008, there was some discussion in the TCT mailing list (let's see how long that URL lasts...) about ways that the Tcl code itself could evolve to handle things better. But for right now, users have to deal with their wide Unicode either via a different programming language or by working around it in some way (converting wide characters to some other similar character, using some sort of macro representation, etc.)
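To make the surrogate pairs mentioned by RS concrete, here is the standard UTF-16 arithmetic as a small Tcl sketch. It only computes the two 16-bit code units as numbers; it says nothing about how any particular Tcl or tdom version stores them, and the proc name surrogates is made up for illustration.

```tcl
# Split a code point above the BMP into its UTF-16 surrogate pair.
# Returns a two-element list: high surrogate, low surrogate.
proc surrogates {cp} {
    if {$cp < 0x10000 || $cp > 0x10FFFF} {
        error "not a supplementary code point: $cp"
    }
    set v [expr {$cp - 0x10000}]   ;# 20 bits remain after the offset
    # Top 10 bits go into the high surrogate, bottom 10 into the low one.
    list [expr {0xD800 + ($v >> 10)}] [expr {0xDC00 + ($v & 0x3FF)}]
}

# The script A from the question above, U+1D49C:
foreach u [surrogates 0x1D49C] {
    puts [format %04X $u]   ;# prints D835, then DC9C
}
```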
i18n - writing for the world - Arts and crafts of Tcl-Tk programming - some random korean text