An MS Word-like AutoCorrect feature implementation.

LES in June 2003: Here is a story you don't hear every day. You know that AutoCorrect feature in MS Word? You type 'tihs' and Word replaces it with 'this'. I really love that feature. Not so much to correct my typos, but as a shorthand tool. I type 'ill go,w or wo u' and it automatically expands to 'I'll go, with or without you'. The part of the story that you don't hear every day is that I am so addicted to it and have used it for so long that I've built up a 10,000-entry list. Writing in Word, I use shorthand all the time.

So I tried to implement it in a Tk app. I load the entire list from a SQLite database into a Tcl array at startup, and every time I hit space or punctuation, the app looks up the last "word" just typed in that array and, if it is there, deletes the word and inserts its counterpart.

I still need to implement automatic capitalization at the beginning of sentences and some method to undo the auto correction. Ctrl+Z will not yield the expected(?) result. Apart from that, it is a perfect AutoCorrect feature, ready to be implemented in any Tcl/Tk-based text editor.

Thanks to Michael A. Cleverly for very useful hints.

 # First, you must have this text widget: $w.textframe.texto1

 # Now, the binding. We want to launch 'autocorrect' every time we hit the space
 # and/or punctuation keys. The lines below bind the proc to every keysym
 # (the name that %K reports for a key) found in the 'myACkeys' list:

 set myACkeys   {space period comma colon semicolon question exclam slash backslash less greater equal asterisk plus minus parenleft parenright bracketleft bracketright braceleft braceright quotedbl quoteright}
 foreach key $myACkeys  {
        bind $w.textframe.texto1 <$key> { autocorrect }
 }

 # Note: these keysym names were obtained on Windows. They may vary on other platforms.

 # I have a database with two columns: 'type' and 'replace'. Let's load them into
 # an array called 'myAClist'

        set myQuery {select type,replace from autocorrect}
        sq eval $myQuery {}     { array set myAClist [ list $type $replace ] }

 # Now the autocorrect proc

 proc   autocorrect     {}      {
 global w myAClist

 # get the 40 last characters every time you type, like a "trail"
 # 40 is, of course, an arbitrary limit
        set myTrail [ $w.textframe.texto1 get "insert -40c" insert ]
 # myTypeString is a regular expression to get the last word
 # its value may change along the script
        set myTypeString {[^,.;: ]+}

 # The loop. Here is what this loop does:
 # Get the last word in the "trail". If the last word ('type' string) is found,
 # replace it with the 'replace' counterpart. If it is not found, get the TWO
 # last words and search for the two-word string in the array. If it is not found,
 # get the THREE last words and search again. It could go on forever, but I
 # set the limit to 10 words. More than that is very unlikely to be used and
 # might make everything run too slow. My actual application uses only 7.
 # If it is not clear, the purpose of having multi-word 'type' strings is to
 # type, say, "south america" and have it corrected to "South America".
        for     { set myIteration 1 } { $myIteration <= 10 } { incr myIteration }       {
 # -line keeps a word from spanning two lines; \Z anchors the match to the very
 # end of the trail. If the trail does not end in that many words, stop searching.
                if      { ! [ regexp -line "($myTypeString)\\Z" $myTrail => myLastWord ] }      { break }
                set myLastWordWipeSize [ string length $myLastWord ]
 # Note that on the first iteration, the RE $myTypeString is [^,.;: ]+
 # 'info exists' looks for the exact key, even if the word contains * or ?
                if      { [ info exists myAClist($myLastWord) ] }            {
                        $w.textframe.texto1 delete "insert -$myLastWordWipeSize c" insert
                        $w.textframe.texto1 insert insert $myAClist($myLastWord)
 # If the 'type' string is found, that's enough, so break the loop
                        break
                }

 # What if what I just typed is a 'type' string, but I typed it in CAPITALS? It won't
 # be found in the array. Not unless we repeat the previous operation, but slightly
 # different:
                if      { [ info exists myAClist([ string tolower $myLastWord ]) ] }                {
                        $w.textframe.texto1 delete "insert -$myLastWordWipeSize c" insert
                        $w.textframe.texto1 insert insert [ string toupper $myAClist([ string tolower $myLastWord ]) ]
 # Good. Now, if we type 'TIHS', it will be replaced with 'THIS'.
                        break
                }

 # Our single-word 'type' string is not found in the array. Now what? The loop
 # will be run again, of course. This time, let's look for [^,.;: ]+ [^,.;: ]+
 # i.e. the last two words. If it's not found either, the next iteration will look for
 # [^,.;: ]+ [^,.;: ]+ [^,.;: ]+  etc.
                append myTypeString { [^,.;: ]+}
 # if the 10 last words don't match anything, we stop searching, of course
        }
 }
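
For anyone who wants to try the lookup order outside Tk, here is a minimal sketch that works on a plain string instead of a text widget. It is not the script above: acExpand, aliasDict and maxWords are names made up for illustration, a dict stands in for the array, and it assumes Tcl 8.5 or later and a single-line trail.

```tcl
# Sketch only: acExpand and its arguments are hypothetical names.
# Returns the trail with a trailing alias expanded, or the trail unchanged.
proc acExpand {trail aliasDict {maxWords 10}} {
    set word {[^,.;: ]+}
    set pattern $word
    for {set n 1} {$n <= $maxWords} {incr n} {
        # Stop as soon as the trail does not end in n words
        if {![regexp -- "($pattern)\$" $trail -> typed]} { break }
        # Everything in front of the words just matched
        set head [string range $trail 0 end-[string length $typed]]
        if {[dict exists $aliasDict $typed]} {
            return $head[dict get $aliasDict $typed]
        }
        # Typed in CAPITALS: look up the lowercase form, answer in capitals
        set lower [string tolower $typed]
        if {[dict exists $aliasDict $lower]} {
            return $head[string toupper [dict get $aliasDict $lower]]
        }
        # Nothing found: look one word further back on the next pass
        append pattern " " $word
    }
    return $trail
}

set aliases [dict create tihs this {south america} {South America}]
puts [acExpand "I like tihs" $aliases]
puts [acExpand "a trip to south america" $aliases]
puts [acExpand "TIHS" $aliases]
```

The three calls print "I like this", "a trip to South America" and "THIS". Single words are tried first, so a one-word alias always wins over a longer alias that ends in the same word.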


One final comment: isn't it great to read a script whose variable names make it dead clear what they represent, instead of foreach i $a { set c "[ lindex z end ] is not $i, but could be $b" } ?

In case you're wondering, the 'my' prefix gives the variables the right color in my syntax highlighting scheme, even where they are not preceded by $.

Just a guess, but I believe that if you slightly alter your insertion method to use event generate, it will update the undo/redo queue, albeit on a character-by-character basis. Your other choice is to implement your own undo/redo stack: record each replacement as you perform it, and use an alternate key binding to revert it.

Either way it should be fairly easy.
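
A minimal sketch of the second suggestion, the do-it-yourself undo stack. It works on a plain string so that it can be tried in tclsh; in a text widget, the same bookkeeping would presumably store a text index and use the widget's delete and insert. The names acHistory, acReplace and acUndo are invented for this example, and lassign needs Tcl 8.5 or later.

```tcl
# Sketch only: acHistory, acReplace and acUndo are hypothetical names.
# Each history entry holds: start position, what was typed, what replaced it.
set acHistory {}

# Replace the trailing $typed in $text with $replacement and log the change
proc acReplace {text typed replacement} {
    global acHistory
    set start [expr {[string length $text] - [string length $typed]}]
    lappend acHistory [list $start $typed $replacement]
    return [string range $text 0 [expr {$start - 1}]]$replacement
}

# Revert the most recent logged replacement, leaving later typing intact
proc acUndo {text} {
    global acHistory
    if {[llength $acHistory] == 0} { return $text }
    lassign [lindex $acHistory end] start typed replacement
    set acHistory [lrange $acHistory 0 end-1]
    set stop [expr {$start + [string length $replacement] - 1}]
    # Only revert if the replacement is still where we put it
    if {[string range $text $start $stop] ne $replacement} { return $text }
    return [string replace $text $start $stop $typed]
}

set text [acReplace "I like tihs" tihs this]
append text " "
set text [acUndo $text]
puts "<$text>"
```

After the replacement the text reads "I like this"; the undo restores "I like tihs " and keeps the space typed afterwards. Bound to some spare key, acUndo would also serve as the 'leave what I typed alone' escape hatch.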

Robert Abitbol: Typing like you do in Word with the auto-correct on is strange but innovative (like all strange stuff :-)). But does this really save time? Can you think of something like this for translation?

LV: I use a similar feature in vi - I have a variety of strings set up to remap to correct spellings, or URLs, or whatever.

The one thing that has to be available, though, is a way to tell the process 'stop that - leave what I typed alone'. Sometimes vi makes it a bit cumbersome - I have to type a few letters, stop, back up and then type the rest, etc.

On 20 Oct 2005, LES finally surrenders to Tkabber. But not before implementing auto correction. Here is the recipe:

1) Download Pat Thoyts' version of Tkabber. I don't remember the link right now.

2) Follow the instructions provided at [L1 ] and unwrap the Starkit.

3) Download this autocorrect list [L2 ] and save it somewhere.

4) Open /tkabber.vfs/main.tcl and add these two procs:

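 proc loadAC  {}  {
     set _fp [open "/path/to/autocorrect.txt" r]
     # -nonewline drops the file's final newline, which would otherwise
     # leave 'array set' with an odd number of elements
     set _slurp [read -nonewline $_fp]
     close $_fp
     # Forget the old entries, so aliases deleted from the file disappear on reload
     array unset ::ACLIST
     array set ::ACLIST [split $_slurp "\n="]
     unset _slurp
 }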

 proc autocorrect argW        {

      set _trail [$argW get 1.0 insert]
      set _typeString {[^"'(\[ ]+}

      for {set _c 1}  {$_c <= 5}  {incr _c}        {

          set _lastWord ""
          # \Z anchors the match to the very end of the trail, even in multi-line input
          regexp -line  "($_typeString)\\Z"  $_trail  =>  _lastWord
          set _lastWordWipeSize [string length $_lastWord]

          # 'info exists' looks for the exact key, even if the word contains * or ?
          if  {[info exists ::ACLIST($_lastWord)]}        {
              $argW delete "insert -$_lastWordWipeSize c" insert
              $argW insert insert $::ACLIST($_lastWord)
              break
          }
          # Typed in CAPITALS: look up the lowercase form, insert it in capitals
          if  {[info exists ::ACLIST([string tolower $_lastWord])]}  {
              $argW delete "insert -$_lastWordWipeSize c" insert
              $argW insert insert [string toupper $::ACLIST([string tolower $_lastWord])]
              break
          }
          # Nothing found: include one more word on the next pass
          set _typeString  "$_typeString \[^ \]+"
      }
 }

5) Now open the file /tkabber.vfs/tkabber/chats.tcl and look for the first bind $cw.input line and add these lines:


 foreach i  { space  .  ,  :  ;  ?  !  \"  '  =  (  )  [  ] }          { 
     bind $cw.input <Key-$i> {autocorrect %W}
 }
 bind $cw.input <Control_L><F11> {loadAC}

6) Now the following characters trigger the automatic expansion of the aliases: space . , : ; ? ! " ' = ( ) [ ]

7) Whenever you change the autocorrect.txt file, press Ctrl+F11 and the file/list will be reloaded.
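
A note on the file format: the split in loadAC suggests one type=replace pair per line, and it cuts at every "=", so an alias whose expansion itself contains "=" would throw the pairs out of step. Here is a minimal sketch of a more forgiving parser, assuming that line format; acParse is a made-up helper name, and dict needs Tcl 8.5 or later.

```tcl
# Sketch only: acParse is a hypothetical helper, not part of Tkabber.
# Splits each line at its FIRST "=" only and skips lines it cannot use.
proc acParse {data} {
    set result [dict create]
    foreach line [split $data "\n"] {
        set eq [string first "=" $line]
        # Skip blank lines, lines without "=", and lines with an empty 'type' part
        if {$eq < 1} { continue }
        set type    [string range $line 0 [expr {$eq - 1}]]
        set replace [string range $line [expr {$eq + 1}] end]
        dict set result $type $replace
    }
    return $result
}

set sample "tihs=this\n\neq=a=b\nbroken line\n"
array set ::ACLIST [acParse $sample]
puts $::ACLIST(eq)
```

With the sample data, the array ends up with two entries, and the "eq" alias expands to "a=b". Inside loadAC, the same call would take $_slurp in place of $sample.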

Category GUI - Category Editor utility - Category Word and Text Processing