Updated 2012-01-26 02:21:52 by AMG

Richard Suchenwirth 2002-12-01 - "Keep It Simple Sir!" This saying is often used and less often heeded. Its history may go back to William of Occam's razor and beyond: "If two things do the same job, simpler is better." (More precisely, "entia non sunt multiplicanda praeter necessitatem - (use) no more things than necessary"). Tcl is simpler than many other languages, but Tcl code can be more or less simple. I'm pondering the following rules of thumb:

  • Try to use less lines of code. Shorter code is easier inspected by humans, and run by Tcl interpreters.
  • Try to use less variables. Variables make sense when retrieved more than once, but again, less degrees of freedom allow less bugs.
  • Reinventing wheels is fun in Tcl, even more if you re-use existing functionality (spokes, hubs, ... ;-)
  • Try to depend on less. For instance, global variables have their uses, but doing without is often better.
  • Try to factor out common code. When snippets look suspiciously as if done by copy'n'paste, think on how it might be generalized in one place.
  • To sum up: Tcl'ing is fun. Less Tcl'ing may be more fun.

Re-reading Arbitrary precision math procedures prompted me to bring forth this sermon. The multiply improved code on that page certainly does a good job, but appeared to me less simple than I'd have liked. So I started rewriting it, and mostly found it could be done with about half the LOC and variables. One thing struck me: in almost each rewritten proc I found myself coding like
 if {$x == ""} {set x 0}

Simple enough, but still... I factored that operation into this fancy-named one-liner (which is minimal LOC):}
 proc empty=0 x {expr {$x == ""? 0: $x}}

which does not reset a variable, but in all cases I was interested in the value, and only once. Now for the revised code (it's about addition and multiplication of non-negative integers, that may be bigger than expr could stand). To turn such a "bigInt" into a "little-endian" list of chunks at base 10000 (i.e. max. 4 digits), I wrote:
 proc tolist bigint {
    set res {}
    while {[string length $bigint]} {
        lappend res [scan [string range $bigint end-3 end] %d]
        set bigint [string range $bigint 0 end-4]
    set res

Seems to work well:
 % tolist 1234567890001002003004005006007008009
 8009 700 60 4005 300 20 1 6789 2345 1

and uses 8 LOC and one local variable, instead of 29 LOC and 7 local variables in the original page. In the opposite direction, such chunk lists have to be formatted back to "big-endian" digit strings:
 proc fromlist chunks {
    set res ""
    foreach chunk $chunks {
        set res [format %4.4d $chunk]$res ;# prepend at left
    empty=0 [string trimleft $res 0] ;# [scan] might choke on big ints

7 LOC vs. 12, and apparently less buggy for leading zeroes (the original checked only for chunk==0). Needless to say which coding style I prefer. Now for the addition of two chunk lists:
 proc adda {xs ys} {
    set res {}
    set carry 0
    foreach x $xs y $ys {
        set sum [expr {[empty=0 $x] + [empty=0 $y] + $carry}]
        set carry [expr {$sum / 10000}]
        lappend res [expr {$sum % 10000}]
    if $carry {lappend res $carry} else {set res}

The last line is debatably tricky and could have been done as
 if $carry {lappend res $carry}
 set res

but uses the facts that if returns the result of the branch that succeeded, while lappend returns the resulting list. Anyway, 10 LOC compared to 24 originally, and much less local variables.

Finally, multiplication. DKFs original solution did it with repeated shifted addition; JJS's enhanced code (base-10000 chunks) split it into single-chunk multiplication (multi) and a wrapping loop multa. I follow that model, though again simplified:
 proc multa {xs ys} {
    set res {}
    foreach x $xs {
        set res [adda $res [multi $ys $x]]
        set ys [linsert $ys 0 [set ys 0]] ;# K-less K ;-)
    set res
 proc multi {xs chunk} {
    set res {}
    set carry 0
    foreach x $xs {
        set n [expr {wide($x) * $chunk + $carry}]
        set carry [expr {$n/10000}]
        lappend res [expr {$n % 10000}]
    if $carry {lappend res $carry} else {set res}
if 0 {And now for the factorial demo:}
 proc facta n {
    if {$n<2} {return 1} else {
     fromlist [multa $n [tolist [facta [incr n -1]]]]

Brevity in code is achieved here by the classic recursive form, which is more expensive in runtime compared to the equally possible iteration, but easier on the human eye. The break-even point between simple and fast code is however coming our way, as CPUs get faster every year.. facta 400 took 20 seconds on my 200MHz W95 box to produce the voluminous integer (line breaks added)

and I had to increase the recursion limit to 2000 for this, but on all checks it is equal to the result reported in the Ruby User's Guide which I took as reference, so "it must be right", as it says there ;-) Here's the iterative version:
 proc facta'it n {
    for {set i 1; set this 1} {$i <= $n} {incr i} {
        set this [multa $this [set this $i]]
    fromlist $this

Funny, and against my expectation, that the iterative version is slower than the recursive one:
 % time {facta 200}
 5055078 microseconds per iteration
 % time {facta'it 200}
 57408667 microseconds per iteration

The recursive version also suffers from repeated conversions from and to lists. Hence, a second try that uses slightly more code in the hope for less runtime:
 proc facta2 n {fromlist [facta' [tolist $n]]}
 proc facta' n {
    if {$n<2} {return 1} else {multa $n [facta' [incr n -1]]}

That was definitely worth the effort:
  % time {facta2 200}
 1652267 microseconds per iteration

See also Big integer routines.