Parsing, Bytecodes and Execution

Parsing, Bytecodes and Execution is a collection of ideas about Tcl bytecodes and the parsing and execution framework of the core.

Description

Karl Lehenbauer started this with a post to Usenet asking about a Tcl Assembly Language that could be assembled into Tcl bytecodes. For the thread Tcl Assembly Language, 2000-06-12, Karl's post didn't make it into the archive, just my answer. Please add the reference when it becomes available.

As of 8.5, ::tcl::unsupported::disassemble is available, and 8.6 adds ::tcl::unsupported::assemble. This devalues all the Tcl compilers aiming at hiding script code, of course.

LV: Only if the recreated code was usable for much of anything...

Paul Duffin asked for "a mechanism for me to be able to byte compile my own commands. This needs the ability to register a byte code as mine, with associated callbacks etc and be called when my byte code is executed with full access to the execution framework." Couldn't find Paul's post on Deja either, so please add the reference when it becomes available.

Andreas Kupries: The only thing existing in this area is my ParseTools package [L1 ] which makes the basic parser of Tcl available at the script level. An obvious extension would be a parser using this to recurse down into subscripts (if, while, ...). Preferably configurable via a script defining the argument types of the available commands and not having this hardwired into it.

Andreas Kupries: Now that TclPro is open-source its tclparser package is another possibility.


XOTcl experiments with certain byte-code extensibility [L2 ].


Joe English has ideas on byte-code optimizations, as of spring 2002.

MS too, see MS's bytecode engine ideas.

KBK also, see The anatomy of a bytecoded command.


KBK: "Believe it or not, Tcl's regular nature means that saving byte codes isn't actually a win; reading byte codes using the loader is actually slower than generating them afresh from Tcl code.

Nevertheless, the byte code dump/load process is useful for those who don't wish to divulge their Tcl source code."

AK: Note that the TclPro bytecode dumper/loader converts a lot of the binary structures using a variant of the ascii85 encoding, thus burning cycles which could be spent better. Because of this it is not yet definitive to me that "saving bytecodes is no win".
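To give a feel for the per-group arithmetic AK is referring to, here is a minimal C sketch of plain ASCII85, which packs every 4 bytes into 5 printable characters. It assumes the standard base-85 scheme with '!' as digit zero. The source only says that TclPro uses a variant, so its details may differ, and the function names here are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper names; standard ASCII85, not the TclPro variant. */

/* Encode one 4-byte group into 5 characters in the range '!'..'u'.
 * Each output character costs one division and one modulo by 85. */
void a85_encode_group(const unsigned char in[4], char out[5])
{
    uint32_t v = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
               | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    for (int i = 4; i >= 0; --i) {
        out[i] = (char)('!' + v % 85);
        v /= 85;
    }
}

/* Decode one 5-character group back into 4 bytes.
 * No validation: the caller must pass characters produced by the encoder. */
void a85_decode_group(const char in[5], unsigned char out[4])
{
    uint32_t v = 0;
    for (int i = 0; i < 5; ++i) {
        v = v * 85 + (uint32_t)(in[i] - '!');
    }
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}
```

Every group loaded this way pays for five multiply-add steps on top of the actual bytecode reconstruction, which is the overhead a raw binary format would avoid. Whether that accounts for the measured slowness is not something this sketch can show.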


elfring 2003-11-04: What do you think about compiling into the ".NET intermediate language" code format [L3 ]? Would you like to cooperate with the projects DotGNU [L4 ] or Mono?

DKF: I try not to think about .NET and related stuff (why not just run Tcl/Tk on the native machine and contact the .NET runtime via COM?) But don't let my opinions constrain you from reinventing the wheel here...

somebody nameless: Unfortunately, .NET MSIL code is not really language independent. You can use it for C, C#, Pascal, Fortran or Java, but it is not a good fit for Smalltalk, Perl, Scheme or Tcl (not to mention Prolog). You can of course write a Tcl interpreter in MSIL, but compiling Tcl to MSIL is not possible. At best you could compile only parts of the Tcl code; the rest would be too dynamic (evals) and would have to be interpreted. MSIL does not support all the Tcl-specific string operations, eval and the other dynamic behavior of Tcl. MSIL wants type-safe code and can use only that. There is a Smalltalk (SmallScript) compiler for .NET (Smalltalk also has no types at compile time). It works by duplicating every method for every possible object type it can be called with.

The idea of having one bytecode for every possible language is not feasible.

elfring 2003-11-05: What is so specific in the anatomy of a bytecoded command that another code format cannot be used? What do you think about an open programming interface to switch between code generators and compilers?

DKF 2003-11-06: Try it and find out! It's just not on my priority list. (Have fun with getting compact ways of dealing with variables...)


Zarutian 2004-07-26: Here is an idea I don't know where to put: don't use switch to parse the bytecodes; use a binary tree.

Lars H: What would be the point of that? Finding something in a binary tree requires time proportional to the logarithm of the size of the tree. A C switch is normally implemented as a jump to an address stored in a vector, which (barring cache misses and the like) is a constant-time operation. (The Tcl switch command is different though, and could perhaps be outperformed by a well-structured binary search, but that has nothing to do with bytecode.)
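The comparison Lars H makes can be sketched in C with a toy stack machine. This is only an illustration: the four opcodes, the VM struct and all function names are hypothetical and have nothing to do with Tcl's real instruction set. One dispatcher uses a switch, the other a binary search over a sorted handler table that counts how many probes it needs.

```c
#include <stddef.h>

/* Hypothetical opcodes for illustration; not Tcl's instruction set. */
enum { OP_PUSH = 0, OP_ADD = 1, OP_MUL = 2, OP_DONE = 3 };

typedef struct {
    const unsigned char *pc;  /* next byte to execute */
    long stack[16];           /* no overflow checks in this sketch */
    int top;
    int done;
} VM;

static void op_push(VM *vm) { vm->stack[vm->top++] = *vm->pc++; }
static void op_add(VM *vm)  { long b = vm->stack[--vm->top]; vm->stack[vm->top - 1] += b; }
static void op_mul(VM *vm)  { long b = vm->stack[--vm->top]; vm->stack[vm->top - 1] *= b; }
static void op_done(VM *vm) { vm->done = 1; }

/* Switch dispatch: for dense case labels a compiler will typically
 * emit a jump table, so selecting the handler is one indexed jump. */
long run_switch(const unsigned char *code)
{
    VM vm = { code, {0}, 0, 0 };
    while (!vm.done) {
        switch (*vm.pc++) {
        case OP_PUSH: op_push(&vm); break;
        case OP_ADD:  op_add(&vm);  break;
        case OP_MUL:  op_mul(&vm);  break;
        case OP_DONE: op_done(&vm); break;
        default:      return -1;    /* unknown opcode */
        }
    }
    return vm.top > 0 ? vm.stack[vm.top - 1] : -1;
}

typedef void (*Handler)(VM *);
typedef struct { unsigned char op; Handler fn; } Entry;

/* Handler table sorted by opcode, searched like a balanced binary tree. */
static const Entry table[] = {
    { OP_PUSH, op_push }, { OP_ADD, op_add },
    { OP_MUL,  op_mul  }, { OP_DONE, op_done }
};

static Handler find_handler(unsigned char op, long *probes)
{
    size_t lo = 0, hi = sizeof table / sizeof table[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        ++*probes;                        /* one comparison step */
        if (table[mid].op == op) return table[mid].fn;
        if (table[mid].op < op) lo = mid + 1; else hi = mid;
    }
    return NULL;
}

/* Binary-search dispatch: same result, but O(log n) probes per opcode. */
long run_search(const unsigned char *code, long *probes)
{
    VM vm = { code, {0}, 0, 0 };
    while (!vm.done) {
        Handler h = find_handler(*vm.pc++, probes);
        if (h == NULL) return -1;         /* unknown opcode */
        h(&vm);
    }
    return vm.top > 0 ? vm.stack[vm.top - 1] : -1;
}
```

Running both on the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, DONE gives the same result, but run_search spends more than one probe per instruction on average, and that gap grows with the number of opcodes.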


RHS: I figured this would be a good place to link to a new page about RHS's Bytecode Package


"Tcl bytecode optimization: some experiences" [L5 ]


jima 2011-06-13: Perhaps Catenation and Operand Specialization for Tcl Virtual Machine Performance, a Masters thesis by Benjamin Vitale, 2004, is also well placed here.