Obfuscation

Much as I (DL) enjoy obfuscation, I had long thought the idea has no place in Tcl - the syntax just doesn't lend itself to obscurity. Tcl has very little dependence on context, overloading, etc. in contrast to, say, C and Perl where things like context are everything. Yes, there are a couple of nuances like when # is really a comment and when it isn't. And yes, you can redefine basic commands like "if" and "proc". Hmm. Gosh, maybe I was shortsighted in my original belief. Maybe I was just in denial. All of a sudden, the language does present some interesting possibilities.

In fact, obfuscation doesn't just mean perverting the language for perversion's sake. Indeed, as I said in the BOOK Obfuscated C and Other Mysteries, sometimes one must sacrifice readability for performance. In scripting environments, obfuscated code has become useful for source protection (or at least for making it hard for cavalier code changes).

On the other hand, a good laugh is a wonderful thing. And when Jeffrey Hobbs requested an obfuscator as one of the questions at the Tcl contest at Tcl2K, I decided to see if I could write one that was really, really short (since I was at the conference at the time and was really, really tired) while still producing output that was not only opaque but amusingly bizarre.

Since one of the reasons for Obfuscation contests is for people to figure out these little gems on their own, I'll decline from explaining all the nuances of this except to say:

- Although the basic obfuscation is conceptually simple, the obfuscated representation is rather surprising. Also, a different obfuscation is produced each time even if the input is the same.

- Use of a package wasn't to impress the judges (although using a different namespace was :-). Rather, the package provided the ability to deobfuscate the string in a different interp than the one that had obfuscated it in the first place. (The alternative method was to embed the algorithm in the string itself. I deemed this technique to be too readable.)

The code:

 namespace eval o {
    proc -) {k s} {
        foreach c [split $s ""] {
            scan $c %c c
            incr c $k
            append buf [format %c $c]
        }
        return $buf
    }

    proc obfuscate {s} {
        set k [expr {int(rand()*255+1)}]
        return "package require obf;o::-) -$k [list [-) $k $s]]"
    }
 }

 package provide obf 1.0

If you want to run this, save it in a file called "obf" along with pkgIndex.tcl:

 package ifneeded obf 1.0 [list source [file join $dir obf]]

Put both files in a directory called "obf1.0" and make sure it is somewhere in your $tcl_pkgPath.

Use the o::obfuscate proc to obfuscate. The resulting string can be eval'd to get the unobfuscated form.

 % set x [o::obfuscate "hello world"]
 % eval $x

For more amusement, see Braintwisters and Quines.


Adapting the above code to be more obscure...

 namespace eval o {
     proc -) {k s {f {}}} {
         binary s $s c* s; foreach {{
         #}}   $s {lappend f [incr {
         #}    $k]  ;^P}
         binary f c* $f
     }

     proc ^P {} {upvar k x;set x [expr {($x>0?1:-1)*(abs($x)%255+1)}]}

     proc obfuscate {s} {
         set k [expr {int(rand()*255+1)}]
         format "package r obf;o::-) -$k %s" [list [-) $k $s]]
     }
 }

 package pro obf 1.0

Suggestions for making things even worse: try introducing a [map] operation in there so as to get rid of the blindingly obvious [foreach] and [lappend] operations. An obscurer obscurer must be our ultimate goal... DKF

( I changed "package p" to "package pro" as "p" is ambiguous in tcl8.3. Perhaps the use of short form arguments is dangerous practice for forward compatibility? -JCE)

KBK (7 November 2000) All right, Donal, here's a version of the 'obf' script with [string map]. Note that the leading spaces were not just added for the formatter: they're significant. Cut and paste it exactly verbatim!


 namespace eval o [string map {{ } {  } ! et {"} nc # \]\} {$} { #}
 % { c} & { $} ' ex ( fo ) \}\  * -1 + {;s} , { f} - \{u . {h } / (a 
 0 oc 1 { k} 2 {($} 3 ac 4 { -} 5 P\} 6 \{\{ 7 {{}} 8 { p} 9 {) } 
 : 55 {;} { s} < { x} = {* } > )* ? ar @ \ \{ A bi B {c } C {[i} 
 D \{\n E ^P F pp G k\] H x) I \}\} J pr K {r } L {

 } M {d } N bs O c* P pv Q re R {$f} S {s } T la U )\} V 0? W \}\n 
 X {;^} Y ro Z 1: {[} {; } \\ +1 \] x> ^ {f } _ {$s} ` #\} a {
 } b { [} c ry d en e \{k f %2 g na {
 } {}} {a  J049e;@^7I@a    Agc;&SO;[(Q3.6a    `) _@TFdM^C"KD    $
 ) &G X5a    Agc,%=Ra  W  8YBE@)-P?1<+!<b'J@2]VZ*>/N2Hf:\U#L}]
 namespace eval o [string map {{ } {  } ! %s {"} \}\] # { "} {$} { $}
 % fo & s\} ' 5+ ( ob ) oc * {]
 } + { o} , us - ag . pa / {f;} 0 {) } 1 nt 2 {" } 3 an 4 bf 5 {[e} 
 6 (r 7 \ \{ 8 )* 9 ca : \ \} {;} at < {[l} = {[-} > o: ? pr @ {r } 
 A 1) B {k } C {

 } D is E ck F s\] G xp H d( I {e } J rm K {$k} L se M {t } N 25 
 O :- P te Q {-$} R {
 } S \{i {
 } {}} {R  ?)+4,9P7&7R    LMB5G@S163H8N'A"R    %J;#.E-I@(/>O0QB!2<DM
 =0K$F*  :C}]
 eval [string map {{ } {  } ! { p} {"} ro # { 1} {$} ka % ge & {

 } ' bf ( {
 } ) ac * .0 + { o} {
 } {}} {( !)$%!"+'#*&}]

 In article <[email protected]>, [Phil Ehrens] writes
 >You have a secret agenda here, don't you?
 >It's the obfuscated Tcl contest isn't it!?

 Damn!  Rumbled!
                foreach   ??   [  set   ?!   102   ;
                set   !?   -280   ;   set   !!   272
                split   Perl   {}   ]   {   scan   [
     set  ??]  %c  ??;   puts -nonewline [   binary  format  c  [
     incr  ??  [  incr  ?!  [  incr  !?  [  incr  !!  -90  ]]]]]}

Donal Fellows (this posting brought here by RS)

Or even...

  foreach ?? [set    ?! 102; set   !?\
      -280;        set             !!\
       272;       split            Perl\
        {}]       {scan            [set\
        ??]         %c             ??;
       puts          -nonewline    [binary format\
   c [incr ?? [incr ?! [incr !? [incr !! -90]]]]]}

LES Wow. This is really impressive indeed. Except that I can't seem to run much more than one-liners. Bigger scripts fail - and debugging them is impossible.

DKF Debugging?! This stuff is art! It doesn't need to be associated with stuff as mundane as mere bugs! That it reboots your cat and puts the development machine out for the night is an amusing feature!


DKF - A short word on the basic principle behind the previous piece of code.

It works by computing valuations of a polynomial by the method of differences. These values are then added to the numeric value of each character in a string to produce a new string. The number of terms you need in the polynomial is proportional to the number of characters you wish to transform. Luckily, I've got some code to calculate these coefficients (using the method of differences.) It is very hard to calculate these by hand...

 proc calculateCoefficients {fromString toString} {
    if {[string length $fromString] != [string length $toString]} {
       return -code error "input strings must be same length"
    }
    binary scan $fromString c* from
    binary scan $toString c* to

    puts "fill (nearly all) top line of array"
    for {set i 0} {$i<[llength $from]} {incr i} {
       set x(0,[expr $i+1]) [expr [lindex $to $i]-[lindex $from $i]]
    }

    puts "fill top right of array"
    for {set j 1} {$j<[llength $from]} {incr j} {
       for {set i [llength $from]} {$i>=$j+1} {incr i -1} {
          set x($j,$i) [expr $x([expr $j-1],$i)-$x([expr $j-1],[expr $i-1])]
       }
    }

    puts "fill bottom row of array"
    incr j -1
    set v $x($j,[llength $from])
    for {set i 0} {$i<[llength $from]} {incr i} {set x($j,$i) $v}

    puts "fill rest of array"
    for {incr j -1} {$j>=0} {incr j -1} {
       for {set i [llength $from]} {$i>=0} {incr i -1} {
           if {[info exist x($j,$i)]} {continue}
           set x($j,$i) [expr $x($j,[expr $i+1])-$x([expr $j+1],[expr $i+1])]
       }
    }

    puts "extract coefficients"
    set result {}
    for {set j 0} {$j<[llength $from]} {incr j} {
       lappend result $x($j,0)
    }

    return $result
 }

For example, suppose we wish to convert Python to Tcl/Tk (chosen because they are the same length of string; that makes things much easier.) We just feed these in to the above code, and it spits out what the polynomial initialisers are.

 % calculateCoefficients "Python" "Tcl/Tk"
 fill (nearly all) top line of array
 fill top right of array
 fill bottom row of array
 fill rest of array
 extract coefficients
 890 -3755 6539 -5803 2605 -472

Now we can easily write the Tcl code to perform the transformation.

  set a   890
  set b -3755
  set c  6539
  set d -5803
  set e  2605
  set f  -472
  binary scan Python c* chars
  set result {}
  foreach x $chars {
      append result [format %c [incr x [incr a [incr b [incr c \
              [incr d [incr e $f]]]]]]]
  }
  puts $result

I leave actually obfuscating this as an exercise to the reader.


DKF: A more general mechanism of applying these polynomials to strings is:

 proc applyCoefficients {coeffs string} {
    set idx [binary scan $string c* vals]
    foreach coeff $coeffs {set x([incr idx]) $coeff}
    foreach v $vals {
       for {set i $idx} {$i>2} {incr i -1} {incr x([expr $i-1]) $x($i)}
       lappend result [incr v $x(2)]
    }
    return [binary format c* $result]
 }

Example use:

 set cs [calculateCoefficients "Python" "Tcl/Tk"]
 puts [format "p(%s)(\"%s\") = \"%s\"" [join $cs ,] "Python" \
         [applyCoefficients $cs "Python"]]

Try applying the polynomial {-5334 33545 -91431 139782 -129083 71902 -22352 2990} to the string "VBScript"... :^)


  eval [string map {+ " " - ;} puts+hello-puts+world] ;# RS

DKF in the Tcl chatroom on 2002-12-05:

 [set for ever; set $for for] "set $ever$for $for$ever" "$$ever$for ne {now}" "vwait $ever$for" {}

See also Super Code for a different way of making the language unnecessarily (or is it really necessarily? ;^)) obscure...


MJ - Was shown on the Tcl-chat by MS

 set set set; [$set $set] $set $set

Of course because now every set or $set can be replaced by [set set] the following is valid:

 [[[[set set] [set set]] [[set set] [set set]]] [[[set set] [set set]] [[set set] [set set]]]]\
 [[[[set set] [set set]] [[set set] [set set]]] [[[set set] [set set]] [[set set] [set set]]]]\
 [[[[set set] [set set]] [[set set] [set set]]] [[[set set] [set set]] [[set set] [set set]]]]

Repeat until noxious.


APW - Was (mostly complete) shown on the Tcl-chat by MJ

 interp alias {} || {} set
 set | set
 set ? 42
 puts [[[[|| |] [|| |] [|| |]] [[|| |] [|| |] [|| |]] [[|| |] [|| |] [|| |]]] ?]

 % 42

Lars H: Why not do away with ||, by doing instead

 interp alias {} {} {} set
 puts [[[[[] |] [[] |] [[] |]] [[[] |] [[] |] [[] |]] [[[] |] [[] |] [[] |]]] ?]

? Alternating between [] and {} for the empty string seems useful for confusing people trying to understand an obfuscated Tcl script.

MJ - That's evil :-), note that with [set {} set] this can even be written as a lot of nothing:

 % set {} set
 % set ? 42
 % interp alias {} {} {} set
 % [[[[] []] [[] []] [[] []]] [[[] []] [[] []] [[] []]] [[[] []] [[] []] [[] []]]] ?
 42

DKF: This gem by KBK is both clear and very obscure in how it goes about it. tailcall and apply let you do some evil tricks.

proc fib {x} {
    tailcall fibcps $x {{x} {return $x}}
}
proc fibcps {x cont} {
    if {$x <= 1} {
        tailcall apply $cont 1
    } else {
        tailcall fibcps [expr {$x-1}] \
            [list [list y [list x $x] [list cont $cont]] {
                tailcall fibcps [expr {$x-2}] \
                    [list [list z [list y $y] [list cont $cont]] {
                        tailcall apply $cont [expr {$y+$z}]
                    }]
            }]
    }
}
for {set i 0} {$i < 10} {incr i} {
    puts [fib $i]
}

RLE (2014-02-03): Correct me here if I'm wrong, but this above looks like Tcl written in Continuation Passing Style (http://en.wikipedia.org/wiki/Continuation-passing_style ).


For functional obfuscation, see mktclapp, tbcload, procomp, source protection, ...

See also playing with obfuscation, commenting and obfuscating, Code Golf Saving Time