taiku goes multilingual

Richard Suchenwirth 2002 - Since 8.1, a GIF image with the Chinese characters tai-ku has been included with Tcl. As I've never seen anybody use it, even less name a software after it, I now dare to present taiku, a little multilingual editor (tai-ku (ª· \u592a\u9177) is supposed to mean "extremely cool", and especially Tcl ;-) which for the Chinese part I hacked together in a day. (15 years ago it took me several weeks, in TurboPascal on a Z80 box...)

WikiDbImage taiku.jpg

Another day, and Japanese facilities were added as well (but the pointer list tim_tbl_jp is still incomplete and buggy - I had to assign romaji from memory, and my Japanese isn't that good ;-) Please improve it if you can, and let me know! In Japanese mode, unresolved words are converted to hiragana signs (uppercase makes Katakana). On the same day, Greek (gr), Hebrew (iv - which of course goes right to left), Korean Hangul (kr) and Russian (ru) were also added, as comparatively easy exercises. Hangul are special in that they have an alphabet, but syllables are two- dimensionally constructed of letters, sort of like Kanji. Therefore Hangul are prepared in the "favorite" field, but no menu is used. Make sure to put dashes (or spaces) between Korean syllables.

The main task was to write a little "input manager" that maps Latin keystrokes from the keyboard to the corresponding target characters. Since pronunciation spelled in pinyin/romaji is mostly ambiguous, a menu of 10 characters is presented wherever suitable. You select one of the characters by typing the associated digit, or can scroll through the whole "character space" of GB2312-80/Level 1 (which offers you 3755 most frequent characters) with the Up/Down keys. Most frequent characters are offered in the yellow "favorite" field; select them by just typing a space. (You might want to edit the list of favorites to include yours; you can also assign sequences of several characters to an input ASCII sequence). In the code, all characters are addressed with their decimal GB2312-80 resp. JIS-X0208 number (because these encodings are sorted by pronunciation - hooray for Tcl's smooth i18n support also regarding encoding conversions!).

In alphabets (Greek/Hebrew/Russian), keystrokes are directly mapped. Check the "?/Favorites" menu item to see the mapping tables.

Toggling between ASCII and "tim" input is done with the <Esc> key for Japanese and Chinese, or by clicking into the text resp. entry widget. To switch languages, use the blue button at bottom left.

Printing is solved only for Windows 95/ME/2000, where the only program able to print Chinese was Internet Explorer. Hence, a temporary HTML file is created and opened in an IE window. You can adjust font size etc. there and let it print for you. (Printing Hebrew need reverting by bidi rendering, because IE is clever enough to implicitly revert Hebrew strings, which were explicitly reverted in taiku to look right on screen). Copy/Paste with Unicodes works inside the same app on all platforms, between Tk apps only on Win 2000. Works on Sun/Solaris (if fonts exist, so not for CJK), including local copy/paste, but pasting does not work across Reflection/Windows. And of course, on any box you'll need at least one font that contains Chinese and Japanese characters...

 package require Tk
 #------------------------------------- tim: Tcl input manager
 set tim(modes) {cn gr iv jp kr ru}
 proc tim {w args} {
    global tim
    array set opt [concat {
        -font {Times 12} -msg "" -to "" -mode cn} $args]
    set tim(to)   $opt(-to)
    set tim(menu) "  $opt(-msg)"
    set tim(mode) $opt(-mode)
    set f $opt(-font)
    set tim(w) [frame $w]
    button $w.mode -width 3 -font {Helvetica 8 bold} -textvar tim(mode)\
        -pady 0 -command tim_mode -borderwidth 0 -bg NavyBlue -fg white
    entry $w.entry    -width 10 -font $f -textvar tim(entry) -borderw 1
    label $w.favorite -width 10 -font $f -textvar tim(favorite)
    label $w.menu     -width 30 -font $f -textvar tim(menu) -anchor w
    pack $w.mode $w.entry $w.favorite -side left
    pack $w.menu -fill x -expand 1
    bind $w.entry <Up>         "tim_showMenu $w.menu - -10"
    bind $w.entry <Down>       "tim_showMenu $w.menu -  10"
    bind $w.entry <KeyRelease> [list tim_handle $w.menu %A %k]
    bind $w.entry <Return>     {$tim(to) insert insert \n}
    bind $w.entry <FocusIn>    [list $w.favorite config -bg yellow]
    bind $w.entry <FocusOut>   [list $w.favorite config -bg [$w cget -bg]]
    bind $w.entry <Escape>     {focus $tim(to)}
    after 100 [list bind $::tim(to)  <Escape> [list focus $w.entry]]
    focus $w.entry
    set w
 }
 proc tim_mode {} {
    global tim
    if {$tim(mode)=="iv"} {$tim(to) mark set insert l2r}
    set modes [concat $tim(modes) $tim(modes)] ;# simulated wrapping
    set pos [lsearch $modes $tim(mode)]
    set tim(mode) [lindex $modes [incr pos]]
    if {$tim(mode)=="iv"} {$tim(to) mark set l2r insert}
    set w $tim(to)
    foreach tag [bindtags $w] {
        if {[lsearch $tim(modes) $tag]<0} {lappend tags $tag}
    }
    if {$tim(mode)=="ru" || $tim(mode)=="gr" || $tim(mode)=="iv"} {
        set tags [concat $tim(mode) $tags]
        $tim(w).entry config -state disabled -bg grey
        set tim(menu) "  See ?/Favorites for current keyboard mapping"
        focus $w
    } else {
        $tim(w).entry config -state normal -bg white
        focus $tim(w).entry
        set tim(menu) "  Ready for pinyin/romaji input"
    }
    bindtags $w $tags
    set tim(entry) ""; set tim(favorite) ""
 }
 proc tim_handle {w key keycode} {
    if {$key=="" && $keycode!=8 || $key=="\x0D"} return
    global tim
    set mode $tim(mode)
    upvar #0 tim_tbl_$mode tbl
    upvar #0 tim_fav_$mode fav
    switch -regexp -- $key {
      {^ $} {
        if {$tim(favorite)!=""} {
            $tim(to) insert insert $tim(favorite)
        } else {$tim(to) insert insert " "}
      }
      {[-A-Za-z?]|^$} {
        set tim(menu) ""; set tim(favorite) ""
        if [info exist tbl($tim(entry))] {tim_showMenu $w $tbl($tim(entry))}
        if {$tim(mode)=="kr"} {
            set tim(favorite) [hanglish $tim(entry)]
        } elseif [info exist fav($tim(entry))] {
            set tim(favorite) [dgb2uc $fav($tim(entry))]
        } elseif {$tim(mode)=="jp"} {
            set tim(favorite) [a2kana $tim(entry)]        
        } else {set tim(favorite) [string index $tim(menu) 1]}
        return ;# omit the clearing at end of proc
      }
      {[0-9]} {
        if {$tim(entry)==$key} {$tim(to) insert insert $key}
        if {$key == 0} {set key 10} ;# 0 comes after 9 in menu,kbd
        set char [string index $tim(menu) [expr {$key*3-2}]]
        $tim(to) insert insert $char
        set stem [string range $tim(entry) 0 end-1]
        if {![info exist fav($stem)]} {set fav($stem) [uc2dgb $char]}
      }
      default {$tim(to) insert insert $key}
    }
    set tim(entry) ""; set tim(menu) ""; set tim(favorite) ""
 }
 proc tim_showMenu {w {from -} {increment 0}} {
    global tim
    if {$from!="-"} {
        set tim(dbcs) $from ;# explicit start point was set
    } else {set from [incr tim(dbcs) $increment]}
    set tim(menu) ""
    for {set i 1} {$i<=10} {incr i} {
       if {$from%100>94} {incr from 6}
       append tim(menu) [expr {$i%10}] [dgb2uc $from] " "
       incr from
    }
    append tim(menu) ($tim(dbcs))
 }
 proc dgb2uc dlist {
    set res ""
    global tim
    foreach d $dlist {
       set b1 [format %c [expr {$d/100+160}]]
       set b2 [format %c [expr {$d%100+160}]]
       append res [encoding convertfrom euc-$tim(mode) $b1$b2]
    }
    set res
 }
 proc uc2dgb uc {
    global tim
    set res ""
    foreach byte [split [encoding convertto euc-$tim(mode) $uc] ""] {
       append res [format %02d [expr {[scan $byte %c]-160}]]
    }
    set res
 }
 # cn: Chinese - Starting positions of Pinyin syllables in GB2312-80
 array set tim_tbl_cn {
 . 102 a 1601 ai 1603 an 1618 ang 1625 ao 1628
 ba 1637 bai 1655 ban 1663 bang 1678 bao 1690 bei 1713 ben 1728 beng 1732
 bi 1738 bian 1762 biao 1774 bie 1778 bin 1782 bing 1787 bo 1803 bu 1822
 ca 1833 cai 1834 can 1844 cang 1852 cao 1857 ce 1862 ceng 1867 cha 1869 chai 1881
 chan 1884 chang 1893 chao 1912 che 1921 chen 1927 cheng 1937 chi 1952 chong 1968 
 chou 1973 chu 1985 chuan 2008 chuang 2015 chui 2021 chun 2026 chuo 2033 ci 2035
 cong 2047 cou 2053 cu 2054 cuan 2058 cui 2061 cun 2069 cuo 2072
 da 2078 dai 2084 dan 2102 dang 2117 dao 2122 de 2134 deng 2137 di 2144 dian 2163
 diao 2179 die 2188 ding 2201 diu 2210 dong 2211 dou 2221 du 2228 duan 2243
 dui 2249 dun 2253 duo 2262 e 2274 en 2287 er 2288
 fa 2302 fan 2310 fang 2327 fei 2338 fen 2350 feng 2365 fo 2380 fou 2381 fu 2382
 ga 2433 gan 2441 gang 2452 gao 2461 ge 2471 gei 2488 gen 2489 geng 2491 gong 2504
 gou 2519  gu 2528 gua 2546 guai 2552 guan 2555 guang 2566 gui 2569 gun 2585
 guo 2588 ha 2594 hai 2601 han 2608 hang 2628 hao 2630 he 2639 hei 2657 hen 2659
 heng 2663 hong 2668 hou 2677 hu 2684 hua 2708 huai 2717 huan 2722 huang 2736
 hui 2750 hun 2771 huo 2778 ji 2787 jia 2846 jian 2863 jiang 2909 jiao 2922
 jie 2950 jin 2977 jing 3005 jiong 3028 jiu 3030 ju 3047 juan 3072 jue 3085 jun 3089
 ka 3106 kai 3110 kan 3115 kang 3121 kao 3128 ke 3132 ken 3147 kong 3113 kou 3157
 ku 3161 kua 3169 kuai 3173 kuan 3177 kuang 3179 kui 3187 kun 3204 kuo 3208
 la 3212 lai 3219 lan 3221 lang 3237 lao 3244 le 3253 leng 3266 li 3269 lia 3309
 lian 3310 liang 3324 liao 3335 lie 3348 lin 3353 ling 3364 liu 3379
 long 3390 lou 3405 lu 3411 l?3432 lua 3445 lun 3453 luo 3460
 ma 3472 mai 3481 man 3487 mang 3502 mao 3508 me 3520 mei 3521 men 3537 meng 3540
 mi 3548 mian 3562 miao 3571 mie 3580 min 3581 ming 3587 mo 3594 mou 3617
 mu 3620 na 3635 nai 3641 nan 3647 nao 3652 ne 3656 neng 3660 ni 3661 nian 3672
 niang 3679 niao 3681 nie 3683 nin 3690 ning 3691 niu 3703 nong 3707 nu 3711
 n?3714 nuan 3715 nuo 3721 o 3722 ou 3723
 pa 3730 pai 3736 pan 3742 pang 3750 pao 3756 pe 3762 pei 3771 peng 3773 pi 3787
 pian 3810 piao 3814 pin 3820 ping 3825 po 3834 pu 3843
 qi 3858 qia 3901 qian 3903 qiang 3925 qiao 3933 qie 3948 qin 3953 qing 3964
 qiong 3978 qiu 3979 qu 3987 quan 4006 que 4017 qun 4025
 ran 4027 rang 4031 rao 4036 re 4039 ren 4041 reng 4051 ri 4053 rong 4056
 rou 4064 ru 4067 ruan 4077 rui 4080 ruo 4084
 sa 4086 sai 4089 san 4093 sang 4103 sao 4106 se 4110 sen 4113 sha 4115 shan 4127
 shang 4142 shao 4150 she 4163 shen 4173 sheng 4189 shi 4206 shou 4253 shu 4263
 shua 4302 shuan 4308 shuang 4310 shui 4313 shun 4318 shuo 4321 si 4325 song 4341
 sou 4349 su 4351 suan 4365 sui 4368 sun 4378 suo 4385
 ta 4390 tai 4405 tan 4414 tang 4432 tao 4445 te 4456 teng 4457 ti 4461 tian 4476
 tiao 4484 tie 4489 ting 4492 tong 4508 tou 4521 tu 4525 tuan 4536 tui 4538
 tun 4544 tuo 4547
 wa 4558 wai 4564 wan 4567 wang 4584 wei 4594 wen 4633 weng 4643 wo 4646 wu 4655
 xi 4684 xia 4725 xian 4738 xiang 4764 xiao 4784 xie 4808 xin 4829
 xing 4839 xiong 4854 xiu 4861 xu 4870 xuan 4889 xue 4905 xun 4911
 ya 4925 yan 4941 yang 4974 yao 4991 ye 5012 yi 5027 yin 5080 ying 5102
 yong 5120 you 5136 yu 5156 yuan 5206 yue 5227 yun 5237
 za 5251 zai 5252 zan 5259 zang 5263 zao 5266 ze 5280 zen 5285 zeng 5286
 zha 5290 zhai 5310 zhan 5319 zhang 5333 zhao 5348 zhe 5358 zhen 5368
 zheng 5384 zhi 5405 zhong 5448 zhou 5459 zhu 5473 zhua 5505 zhuan 5508
 zhuang 5514 zhui 5521  zhun 5528 zhuo 5529 zi 5540 zong 5555 zou 5562
 zu 5566 zua 5574 zui 5576 zun 5580 zuo 5582
 }
 #-- Chinese Favorites - frequent characters or words that are proposed early
 array set tim_fav_cn {
 b 1827
 d 2136
 g 2486 guo 2590
 h 2645 hai 2725
 i 5027 ig {5027 2486}
 ji 2816 jintian {2981 4476}
 l 3343
 m 3539
 n 3667
 r 4043
 s 4239 shenme {4218 3520}
 t 4391
 w 4650 wenti {4642 4466}
 y 5148
 z 5258 zhongguo {5448 2590} zh 5366 zhe 5537
 . 103
 }
 #-------------------------------------------- gr: Greek
 array set tim_fav_gr {
          A \u0391 B \u0392 G \u0393 D \u0394 E \u0395 Z \u0396
          H \u0397 Q \u0398 I \u0399 K \u039A L \u039b M \u039c
          N \u039d J \u039e O \u039f P \u03a0 R \u03a1 S \u03a3
          T \u03a4 U \u03a5 F \u03a6 X \u03a7 Y \u03a8 W \u03a9
          'A \u0386 'E \u0388 'H \u0389 'I \u038a 'O \u038c 'Y \u038e 'W \u038f
          a \u03b1 b \u03b2 g \u03b3 d \u03b4 e \u03b5 z \u03b6
          h \u03b7 q \u03b8 i \u03b9 k \u03bA l \u03bb m \u03bc
          n \u03bd j \u03be o \u03bf p \u03c0 r \u03c1 c \u03c2 s \u03c3
          t \u03c4 u \u03c5 f \u03c6 x \u03c7 y \u03c8 w \u03c9
          'a \u03ac 'e \u03ad 'h \u03ae 'i \u03af 'o \u03cc 'u \u03cd 'w \u03ce
          ' "{}" '' '
 }           
 foreach {in out} [array get tim_fav_gr]} {
    bind gr $in "%W insert insert $out; break"
 }
 #-------------------------------------------- iv: Hebrew
 array set tim_fav_iv {
    a \u5d0 b \u5d1 g \u5d2 d \u5d3 h \u5d4 w \u5d5 z \u5d6 x \u5d7 u \u5d8
    i \u5d9 K \u5da k \u5db l \u5dc M \u5dd m \u5de N \u5df n \u5e0 o \u5e1
    e \u5e2 F \u5e3 f \u5e4 Y \u5e5 y \u5e6 q \u5e7 r \u5e8 s \u5e9 t \u5ea
    ( ) ) (
 }
 foreach {in out} [array get tim_fav_iv]} {
    bind iv $in "%W insert insert $out;%W mark set insert {insert -1 char};break"
 }
 # ------------------------------------------ jp: Japanese
 array set tim_tbl_jp {
    . 102 a 1601 ai 1605 an 1634 i 1642 ichi 1676 in 1684 u 1706 un 1730 ei 1736
    eki 1756 en 1763 o 1787 on 1824 ka 1828 kai 1880 gai 1916 ko 1938
    kan 2005 ki 2076 kyu 2168 kyo 2194 kin 2250 ku 2269 kun 2315 getsu 2378
    ken 2379 gen 2421 ko 2435 go 2462 kyo 2482 koku 2581 kon 2603 sa 2619
    sai 2638 san 2716 shi 2737 shu 2871 ju 2926 jo 2988 jou 3069 shin 3113
    jin 3145 sui 3169 sun 3203 se 3268 sen 3271 son 3424 ta 3430 tai 3432
    dai 3469 tan 3516 chi 3544 chu 3570 machi 3614 tei 3666 tetsu 3720
    ten 3721 den 3737 to 3738 tou 3761 nai 3864 ni 3883 niku 3889 nichi 3892
    nen 3915 no 3921 ha 3935 pai 3976 han 4028 bi 4083 hyaku 4120 pin 4142
    fu 4152 bun 4212 hen 4248 hou 4279 hoku 4345 hon 4359 ma 4364 mai 4368 man 4392
    min 4417 mu 4421 me 4460 mo 4446 mon 4468 ya 4474 yu 4491 yo 4535
    rai 4572 ran 4580 ri 4588 ryu 4613 rei 4665 reki 4681 ren 4687 ro 4704
    roku 4727 ron 4732 wa 4733 wan 4748
 }
 array set tim_fav_jp {
    . 103
    nihon {3892 4360} no 446
    kuruma 2854
    to 440 tokyo {3776 2194}
    wa 447 watashi 2768
 }
 proc a2kana s {
    string map {
      cha \u3061\u3083 chu \u3061\u3085 cho \u3061\u3087 sha \u3057\u3083
      shu \u3057\u3085 sho \u3057\u3087 kya \u304d\u3083 kyu \u304d\u3085
      kyo \u304d\u3087 rya \u308a\u3083 ryu \u308a\u3085 ryo \u308a\u3087
      pya \u3074\u3083 pyu \u3074\u3085 pyo \u3074\u3087
      ka \u304b ga \u304c ki \u304d gi \u304e ku \u304f gu \u3050 ke \u3051
      ge \u3052 ko \u3053 go \u3054 sa \u3055 za \u3056 shi \u3057 ji \u3058
      su \u3059 zu \u305a se \u305b ze \u305c so \u305d zo \u305e
      ta \u305f da \u3060 chi \u3061 di \u3062 tsu \u3064 dsu \u3065
      te \u3066 de \u3067 to \u3068 do \u3069 na \u306a ni \u306b nu \u306c
      ne \u306d no \u306e ha \u306f ba \u3070 pa \u3071 hi \u3072 bi \u3073
      pi \u3074 fu \u3075 bu \u3076 pu \u3077 he \u3078 be \u3079 pe \u307a
      ho \u307b bo \u307c po \u307d ma \u307e mi \u307f mu \u3080 me \u3081
      mo \u3082 ya \u3084 yu \u3086 yo \u3088 ra \u3089 ri \u308a ru \u308b
      re \u308c ro \u308d wa \u308f wo \u3092 n \u3093  a \u3042 i \u3044
      u \u3046 e \u3048 o \u304a  k \u3063 p \u3063 t \u3063
      CHA \u3061\u30e3 CHU \u3061\u30e5 CHO \u3061\u30e7 SHA \u30b7\u30e3
      SHU \u30b7\u30e5 SHO \u30b7\u30e7 KYA \u30ad\u30e3 KYU \u30ad\u30e5
      KYO \u30ad\u30e7 RYA \u30ea\u30e3 RYU \u30ea\u30e5 RYO \u30ea\u30e7
      PYA \u30d4\u30e3 PYU \u30d4\u30e5 PYO \u30d4\u30e7
      KA \u30ab GA \u30ac KI \u30ad GI \u30ae KU \u30af GU  \u30b0 KE \u30b1
      GE \u30b2 KO \u30b3 GO \u30b4 SA \u30b5 ZA \u30b6 SHI \u30b7 JI \u30b8
      SU \u30b9 ZU \u30ba SE \u30bb ZE \u30bc SO \u30bd ZO \u30be
      TA \u30bf DA \u30c0 CHI \u30c1 DI \u30c2 TSU \u30c4 DSU \u30c5
      TE \u30c6 DE \u30c7 TO \u30c8 DO \u30c9 NA \u30ca NI \u30cb NU \u30cc
      NE \u30cd NO \u30ce HA \u30cf BA \u30d0 PA \u30d1 HI \u30d2 BI \u30d3
      PI \u30d4 FU \u30d5 BU \u30d6 PU \u30d7 HE \u30d8 BE \u30d9 PE \u30da
      HO \u30db BO \u30dc PO \u30dd MA \u30de MI \u30df MU \u30e0 ME \u30e1
      MO \u30e2 YA \u30e4 YU \u30e6 YO \u30e8 RA \u30e9 RI \u30ea RU \u30eb
      RE \u30ec RO \u30ed WA \u30ef WO \u30f2 N \u30f3  A \u30a2 I \u30a4
      U \u30a6 E \u30a8 O \u30aa  K \u30c3 P \u30c3 T \u30c3 - \u30fc
    } $s
 }
 #------------------------------------- kr: Korean Hangul
 proc hanglish2uc hanglish {
    # convert a Hanglish string to one Unicode 2.0 Hangul if possible
    set L ""; set V "" ;# in case regexp doesn't hit
    set hanglish [string map {
        NG Q YE X YAI F AI R YA V YO Y YU Z VI F
    } [string toupper $hanglish]]
    regexp {^([GNDLMBSQJCKTPH]+)?([ARVFEIXOYUZW]+)([GNDLMBSQJCKTPH]*)$} \
            $hanglish ->  L V T ;# lead cons.-vowel-trail cons.
        if {$L==""} {set L Q}
    if {$V==""} {return $hanglish}
    set l [lsearch {G GG N D DD L M B BB S SS Q J JJ C K T P H} $L]
    set v [lsearch {A R V F E EI X XI O OA OR OI Y U UE UEI UI Z W WI I} $V]
    set t [lsearch {"" G GG GS N NJ NH D L LG LM LB LS LT LP LH  \
            M B BS S SS Q J C K T P H} $T] ;# trailing consonants
    if {[min $l $v $t] < 0} {return $hanglish}
    format %c [expr {$l*21*28 + $v*28 + $t + 0xAC00}]
 }
 proc min args {lindex [lsort -real $args] 0}
 proc hanglish args {
    set res ""
    foreach i $args {
        foreach j [split $i -] {append res [hanglish2uc $j]}
    }
    append res " "
 }
 foreach i {
    ga ggai nya yai dde lei mye byei bbo soa ssoai oi jyo jju cue
    kuei tui pyu hw wi di
 } {set tim_fav_kr($i) [hanglish2uc $i]}
 #-------------------------------------- ru: Cyrillic (Russian)
 array set tim_fav_ru {
                    ! "{}" !! !
          A \u0410 B \u0411 V \u0412 G \u0413 D \u0414 E \u0415
          J \u0416 Z \u0417 I \u0418 J \u0419 K \u041A L \u041b
          M \u041c N \u041d O \u041e P \u041f R \u0420 S \u0421
          T \u0422 U \u0423 F \u0424 X \u0425 C \u0426 !C \u0427
          !S \u0428 W \u0429 Q \u042a Y \u042b H \u042c !E \u042d
          !U \u042e !A \u042F
          a \u0430 b \u0431 v \u0432 g \u0433 d \u0434 e \u0435
          j \u0436 z \u0437 i \u0438 j \u0439 k \u043a l \u043b
          m \u043c n \u043d o \u043e p \u043f r \u0440 s \u0441
          t \u0442 u \u0443 f \u0444 x \u0445 c \u0446 !c \u0447
          !s \u0448 w \u0449 q \u044a y \u044b h \u044c !e \u044d
          !u \u044e !a \u044f
 }           
 foreach {in out} [array get tim_fav_ru]} {
    bind ru $in "%W insert insert $out; break"
 }
 # end of tim
 #-------------------------------------- Editor routines
 proc htm_print {w} {
     # this works only on Windows 95..ME... 
     set filename [file join $::env(TEMP) taiku.html]
     set fp [open $filename w]
     puts $fp [s2html [$w get 1.0 end]]
     close $fp
     exec $::env(COMSPEC) /c start [file nativename $filename] &
 }
 proc s2html s {
     set res ""
     foreach line [split $s \n] {
        foreach c [split $line ""] {
            scan $c %c uc
            append res [expr {$uc>127? "&#$uc;" : $c}] 
        }
        append res <br>\n
     }
     set res
 }
 proc file:open {w {fn ""}} {
    if {$fn==""} {set fn [tk_getOpenFile]}
    if [string length $fn] {
        $w delete 1.0 end
        set f [open $fn]
        fconfigure $f -encoding utf-8
        $w insert end [read $f]
        close $f
        wm title . "$fn - taiku"
    }
 }
 proc file:save {w select} {
    if {$select || ![string match "* - *" [wm title .]]} {
       set fn [tk_getSaveFile]
    } else {regexp {(.+) - taiku} [wm title .] -> fn}
    if [string length $fn] {
       set f [open $fn w]
       fconfigure $f -encoding utf-8
       puts -nonewline $f [$w get 1.0 end-1c]
       close $f
       wm title . "$fn - taiku" ;# might have been saved under another name
    }
 }
 proc taiku:about {} {
    destroy .about; toplevel .about
    wm title .about "About taiku $::version"
    if {[lsearch [image names] taikuLogo]<0} {
        image create photo taikuLogo -file $::tk_library/images/tai-ku.gif
    }
    label .about.logo -image taikuLogo
    message .about.msg -aspect 1000 -text {\
    Multilingual editor, Richard Suchenwirth 2002

    Switch language with: blue button at bottom left
    Pure ASCII can be entered to text in cn/jp mode
    Switch between ASCII/Pinyin input with: <Esc>
    Type pinyin/hiragana in lowercase, KATAKANA upcase
    Select favorite (yellow field) with: <Space>
    Select character from menu with: digit key 1..9, 0
    Scroll the character menu with: Cursor Up/Down
    Enjoy: the power of Tcl/Tk!}
    pack .about.logo .about.msg -side left
 }
 proc taiku:favorites {} {
    set mode $::tim(mode)
    set alphabet [expr {[lsearch {gr iv kr ru} $mode] >= 0}]
    upvar #0 tim_fav_$mode fav
    destroy .favorites; toplevel .favorites
    set msg ""; set last ""; set newline ""
    foreach i [lsort [array names fav]] {
        append msg [expr {[string index $i 0]==$last? " ": $newline}]
        append msg [expr {$alphabet? "$i:$fav($i)": "$i [dgb2uc $fav($i)]"}]
        set last [string index $i 0]
        set newline [expr {$alphabet? " ": "\n"}]
    }
    pack [message .favorites.msg -aspect 500 -text $msg -font {Times 12}]
 }
 #-------------------------- demo and test
 if {[file tail [info script]]==[file tail $argv0]} {
   set version 0.2
   set msg "Welcome to taiku $version - \u6B22\u8FCE\u7528\u592A\u9177!"
   . config -menu [menu .menu]
   .menu add cascade -label File -menu [menu .menu.file -tearoff 0]
   .menu.file add command -label Open...   -command {file:open .t}
   .menu.file add command -label Print     -command {htm_print .t}
   .menu.file add command -label "Save"    -command {file:save .t 0}
   .menu.file add command -label "Save as..." -command {file:save .t 1}
   .menu.file add separator
   .menu.file add command -label Exit      -command exit
   .menu add cascade -label ? -menu [menu .menu.help -tearoff 0]
   .menu.help add command -label Favorites -command taiku:favorites
   .menu.help add separator
   .menu.help add command -label About...  -command taiku:about
   pack [tim .tim -to .t -msg $msg] -side bottom -fill x
   pack [text .t  -font {Times 12}] -fill both -expand 1
   raise .
   if {[llength $argv]==1 && $argv!="{}"} {file:open .t [lindex $argv 0]}
   bind .t <1> {set tim(menu) "line.char [.t index @%x,%y]"}
   bind . <Control-r> {exec [info nameofexecutable] $argv0 &; exit}
 }

See also iKu for taiku's baby brother on the PocketPC


SB 2004-10-23: Has anybody using AquaTcl successfully got the japanese mode to work? For some reason I do not get the glyphs representing hiragana and katakana properly. I have checked with the special character input facility in macosx that the unicodes that are listed in proc a2kana fit, but I get only empty squares as representation. They come from a different family than times (Hiragino Kaku Gothic Pro). I tried to change the font family without luck. Any suggestion for OSX users?

Lars H: Do other non-latin alphabets work for you? For quite some time there has been a bug in AquaTk that made pretty much all non-latin (non-latin1 or non-macRoman or something like that; I don't remember exactly what anymore) render as "missing character box". (Checking.) Sadly, I still see this problem in 8.4.7 (although looking a bit different than it used to).

VK: I dare to point to a mistake in the Russian text, it should be something like \u042d\u0442\u043e\u0020\u043f\u043e\u002d\u0440\u0443\u0441\u0441\u043a\u0438 - RS agrees - thanks for pointing these two bugs out! I'm only a linguist, but that doesn't mean I really know Russian :( Will put up a corrected screenshot tomorrow...

Dotan: I'm only realy starting with tcl (and I'm not a programer anyway) and this is realy cool. I played with the Hebrew support and it seem to have it's unique mapping so [L1 ] has a Hebrew keyboard layout chart (this is one of three different layouts used for Hebrew but anyway the letters are always in the same place and it's only the special chars that changes). Hebce the mapping should be something like

 #-------------------------------------------- iv: Hebrew

array set tim_fav_iv {

    t \u5d0 c \u5d1 d \u5d2 s \u5d3 v \u5d4 u \u5d5 z \u5d6 j \u5d7 y \u5d8 h \u5d9
    l \u5da f \u5db k \u5dc o \u5dd n \u5de i \u5df b \u5e0 x \u5e1 g \u5e2 ; \u5e3 
    p \u5e4 . \u5e5 m \u5e6 e \u5e7 r \u5e8 a \u5e9 , \u5ea 
    ( ) ) (
 }

I only keep getting "}" after a single lettem Mem