Ruslish

Richard Suchenwirth 2001-06-21 -- Ruslish is a new member of The Lish family of transliterations, mapping strings in lowly 7-bit ASCII to the Unicode characters for Cyrillic letters.

As the Cyrillic alphabet contains 32 (or 33) letters, each in upper- and lowercase, a 1:1 mapping to the 26 letters [A-Z] is not possible. I mostly used intuitive equivalents, with Q for tv'ordyj znak (hard sign) and H for m'axkij znak (soft sign). The letters still unmapped uniformly take a prefixed exclamation mark, so /zh/ = !Z, /ch/ = !C, /sh/ = !S, /shch/ = !T, e oborotnoe = !E, /yu/ = !U, /ya/ = !A, /yo/ = !O.

To test this, make sure you have Tcl/Tk 8.1 or better and at least one font offering cyrillics in Unicode, and say e.g.

 ruslish Moskva i Leningrad

Proc ruslish:helpw pops up a help window with instructions. Enjoy!

 #--------------------------------------------- Ruslish
 set ::i18n_ru {
    !C \u0427 !S \u0428 !T \u0429 Q \u042A  H \u042C !E \u042D
    !U \u042E  !A \u042F !O \u0401
    A \u0410 B \u0411 V \u0412 G \u0413 D \u0414 E \u0415 !Z \u0416 Z \u0417
    I \u0418 J \u0419 K \u041A L \u041B M \u041C N \u041D O \u041E P \u041F
    R \u0420 S \u0421 T \u0422 U \u0423 F \u0424 X \u0425 C \u0426 Y \u042B
    !c \u0447  !s \u0448 !t \u0449 q \u044A  h \u044C !e \u044D
    !u \u044E  !a \u044F  !o \u0451
    a \u0430 b \u0431 v \u0432 g \u0433 d \u0434 e \u0435 !z \u0436 z \u0437
    i \u0438 j \u0439 k \u043A l \u043B m \u043C n \u043D o \u043E p \u043F
    r \u0440 s \u0441 t \u0442 u \u0443 f \u0444 x \u0445 c \u0446 y \u044B
 }
 proc ruslish args {
        if {$args==""} {set args "!Eto po-russkij"}
        foreach {from to} $::i18n_ru {
                regsub -all $from $args $to args
        }
        set args
 }
 proc ruslish:helpw {} {
    destroy .rh; toplevel .rh
    wm title .rh "Ruslish help"
    regsub -all {([A-Za-z]) } [subst $::i18n_ru] {\1:} txt
    regsub -all " " $txt "  " txt
    set example "ruslish Moskva i Leningrad"
    regsub " a:" $txt " \na:" txt ;# line-break before lowercases
    pack [message .rh.msg -font {Times 12} -bg lightyellow -text \
            "Ruslish is a mapping between Latin and Cyrillic letters.\
            Most are intuitive. Tv.znak and M.znak are Q,H.\
            Excess letters are represented with leading ! Mapping:$txt
 Example: \"$example\" 
 produces \"[eval $example]\""]
 }
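
Two details of the table above are easy to miss. The \uXXXX escapes sit inside braces, so they become characters only when the string is parsed as a list (by foreach or string map). And the !-prefixed entries have to be tried before their bare letters. A minimal sketch of both points, using a hypothetical three-pair subset called demo rather than the full table:

```tcl
# Hypothetical subset of the table, for illustration only.
set demo {!C \u0427 C \u0426 a \u0430}

# Inside braces the escape is still six literal characters ...
puts [string length {\u0427}]          ;# 6
# ... but list parsing substitutes it, giving a single character.
puts [string length [lindex $demo 1]]  ;# 1

# Same loop as in proc ruslish: "!C" is replaced before "C" gets a chance.
set s "!Ca Ca"
foreach {from to} $demo {
    regsub -all $from $s $to s
}
puts $s   ;# Cyrillic "Cha Tsa"

# With the bare letter first, the "!" is left behind.
set t "!Ca"
foreach {from to} {C \u0426 !C \u0427 a \u0430} {
    regsub -all $from $t $to t
}
puts $t   ;# "!" followed by Cyrillic "Tsa"
```

This is why !Z is listed before Z, and !C, !S, !T before C, S, T, in the table.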

See also A tiny input manager for an example in direct keyboard mapping for Ruslish.


On Cyrillic encodings in practice, Victor Wagner wrote in comp.lang.tcl: It looks that each major OS have its own encoding.

  • OS/2 and DOS - cp866
  • Windows - cp1251
  • MacOS - MacCyrillic
  • Unix - koi8-r

Of course, there is also iso8859-5, but almost nobody uses it.
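
To see how the same letter lands on different bytes in these encodings, Tcl's built-in encoding command can be used. A small sketch, assuming the encoding names below (in Tcl's spelling) are present in the installation; encoding names lists what is actually available:

```tcl
# Cyrillic capital M (as in Moskva), U+041C
set ch \u041C

foreach enc {cp866 cp1251 macCyrillic koi8-r iso8859-5} {
    # Skip encodings this Tcl installation does not ship.
    if {[lsearch -exact [encoding names] $enc] < 0} continue
    # Convert to the byte encoding and show the resulting byte in hex.
    binary scan [encoding convertto $enc $ch] H* hex
    puts [format "%-12s 0x%s" $enc $hex]
}
```

Going the other way, encoding convertfrom with the right name turns such bytes back into Unicode, after which ruslish-style tables apply unchanged.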


Temporarily here, until we get more content for a Things Russian page: Russian multiplication (http://mathworld.wolfram.com/RussianMultiplication.html) done by halving, doubling, and addition:

 proc russmul {a b {res 0}} {
    if {$a%2}  {incr res $b}    ;# odd row: add the doubled value
    if {$a<=1} {return $res}    ;# done; also stops the recursion for a=0
    russmul [expr {$a/2}] [expr {$b*2}] $res
 } ;# RS
 % russmul 27 35
 945
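
The halving and doubling can also be written as a loop that records each row, which makes it easier to see which doublings get added. A sketch with a hypothetical helper russmul_rows (not part of RS's code), assuming non-negative integers:

```tcl
proc russmul_rows {a b} {
    set res 0
    set rows {}
    while {$a > 0} {
        # A row counts only when the halved number is odd.
        set odd [expr {$a % 2}]
        if {$odd} {incr res $b}
        lappend rows [list $a $b $odd]
        set a [expr {$a / 2}]
        set b [expr {$b * 2}]
    }
    return [list $res $rows]
}

set result [russmul_rows 27 35]
foreach row [lindex $result 1] {
    foreach {a b odd} $row break
    puts [format "%4d %6d %s" $a $b [expr {$odd ? "+" : ""}]]
}
puts "sum: [lindex $result 0]"
```

For 27 and 35 the rows are 27/35, 13/70, 6/140, 3/280 and 1/560; all but the 6/140 row are marked, and 35+70+280+560 gives 945.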

In the event that someone else may find this useful, I have wrapped the ruslish commands into a little applet that transliterates in either direction. Many thanks to RS for the original work, of course!

 #!/bin/sh
 # If running in a UNIX shell, restart wish on the next line \
 exec wish "$0" ${1+"$@"}


 # the ruslish mapping, and its reverse auto-generated
 set ::i18n_ru {
     !C \u0427 !S \u0428 !T \u0429 Q \u042A  H \u042C !E \u042D !U \u042E  !A \u042F !O \u0401
     A \u0410 B \u0411 V \u0412 G \u0413 D \u0414 E \u0415 !Z \u0416 Z \u0417
     I \u0418 J \u0419 K \u041A L \u041B M \u041C N \u041D O \u041E P \u041F
     R \u0420 S \u0421 T \u0422 U \u0423 F \u0424 X \u0425 C \u0426 Y \u042B

     !c \u0447  !s \u0448 !t \u0449 q \u044A  h \u044C !e \u044D !u \u044E  !a \u044F  !o \u0451
     a \u0430 b \u0431 v \u0432 g \u0433 d \u0434 e \u0435 !z \u0436 z \u0437
     i \u0438 j \u0439 k \u043A l \u043B m \u043C n \u043D o \u043E p \u043F
     r \u0440 s \u0441 t \u0442 u \u0443 f \u0444 x \u0445 c \u0446 y \u044B
 }

 foreach { en cyr } $::i18n_ru { lappend ::i18n_en $cyr $en }



 # helpful ruslish procs, courtesy of RS
 proc ruslish:helpw {} {
   destroy .rh; toplevel .rh
   wm title .rh "Ruslish help"
   regsub -all {([A-Za-z]) } [subst $::i18n_ru] {\1:} txt
   regsub -all " " $txt "  " txt
   set example "Moskva i Leningrad" ;# text only: ruslish:cyr maps every letter it is given
   pack [message .rh.msg -font {Arial 12} -text \
     "Ruslish is a mapping between Latin and Cyrillic letters.\
     Most are intuitive. Tv.znak and M.znak are Q,H.\
     Excess letters are represented with leading !

   Mapping: $txt
 Example: \"$example\"
 produces \"[ruslish:cyr $example]\""]
 }


 # to cyrillic, and to latin
 proc ruslish:cyr stg { return [string map $::i18n_ru $stg] }
 proc ruslish:lat stg { return [string map $::i18n_en $stg] }



 # and to keep the output updated ..
 proc refresh {} {
   global to_which

   .out.t delete 1.0 end
   switch $to_which {
     "to rus" { .out.t insert 1.0 [ruslish:cyr [.in.t get 1.0 end-1c]] }
     "to eng" { .out.t insert 1.0 [ruslish:lat [.in.t get 1.0 end-1c]] }
   }
 }



 # a nice little window
 frame .help
 label .help.l -text "Enter some text:"
 tk_optionMenu .help.to to_which {to rus} {to eng}
 button .help.b -text "Map" -command ruslish:helpw

 frame .in
 text .in.t -width 40 -height 4 -relief sunken -yscrollcommand {.in.sb set}
 scrollbar .in.sb -orient vertical -command {.in.t yview}

 frame .out
 text .out.t -width 40 -height 4 -relief raised -font "Arial 12" -yscrollcommand {.out.sb set}
 scrollbar .out.sb -orient vertical -command {.out.t yview}


 pack .help -fill x
 pack .help.l .help.to .help.b -side left -anchor w
 pack .in .out -padx 3 -pady 3
 pack .in.sb .in.t .out.sb .out.t -fill both -expand true -side right



 # and set everything up ..
 wm title . "Ruslish Applet"
 bind activate_refresh <KeyPress> refresh
 bindtags .in.t [concat [bindtags .in.t] activate_refresh]
 focus .in.t

s.havelka (hat0 on tclchat) sep 16, 2007
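
Since the reverse table is just the forward table with each pair swapped, a round trip through both maps should return the Latin input unchanged, at least for text made of mapped letters. A minimal check, assuming a small hypothetical subset fwd of the table rather than the full one:

```tcl
# Hypothetical subset, just enough for the sample text below.
set fwd {!S \u0428 M \u041C o \u043E s \u0441 k \u043A v \u0432 a \u0430}

# Swap each pair to get the Cyrillic-to-Latin direction.
set rev {}
foreach {lat cyr} $fwd { lappend rev $cyr $lat }

set in  "Moskva !Sa"
set out [string map $fwd $in]
puts $out
puts [string equal $in [string map $rev $out]]   ;# 1
```

Unlike the regsub loop, string map scans the input once and picks the first key in list order that matches at each position, so already converted text is never matched again.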


i18n - writing for the world