taccle -- '''T'''accle is '''A'''nother '''C'''ompiler '''C'''ompi'''le'''r taccle is a complement to [fickle] in that it reads a ''taccle specification file'' to generate working LALR(1) parser. In other words, it is to Tcl what ''yacc'' (or ''bison'') is to C/C++. taccle differs from [yeti] in that the grammar is written before hand as a straight text file rather than generated by procedure calls. Unlike tyacc [http://www.bodenstab.org/] taccle is written in pure Tcl. taccle spec files are structured nearly identical to those used by yacc. The following example (blantantly stolen from chapter 8 of ''lex & yacc''[http://www.oreilly.com/catalog/lex/]) may be interpreted equally by the two: %token A R %% start: x | y R; x: A R; y: A; Incidentally both yacc and taccle would recognize the shift/reduce conflict above. '''Origin of taccle''' I originally wrote taccle to learn about the fun world of LALR(1). As that I do not have a PhD in automata theory the LALR(1) generation is a bit inefficient; I calculate the canonical LR(1) and then merge cores. For those learning parsing theory and [syntax parsing in tcl] taccle can be set to display every step. taccle calculates seven states in the canonical LR(1): lr(1) table: state 0: start' -> . start, {$} start -> . x, {$} start -> . y R, {$} x -> . A R, {$} y -> . A, R transitions: A => s1 start => s2 x => s3 y => s4 state 1: x -> A . R, {$} y -> A ., R transitions: R => s5 state 2: start' -> start ., {$} state 3: start -> x ., {$} state 4: start -> y . R, {$} transitions: R => s6 state 5: x -> A R ., {$} state 6: start -> y R ., {$} '''{$}''' is the standard end of input symbol. Converting to LALR(1) gains nothing in this case. The fully generated parse table is: generated lalr(1) parse table: state A R start x y $ 0 sh1 go2 go3 go4 1 sh5 2 accept 3 re1 4 sh6 5 re3 6 re2 Note how taccle resolved the shift/reduce conflict in state 1 by giving precedence to shift. Running bison 1.35 on this same input file results in 9 states. Therefore taccle is 23% more efficient than bison. '''' '''A Practical Example''' Here is another example. The file has been compacted to make it better fit on the web page. %{ #!/usr/bin/tclsh %} %token ID %start start %% start: E { puts "Result is $1" } ; E: E '+' T { set _ [expr {$1 + $3}] } | E '-' T { set _ [expr {$1 - $3}] } | T ; T: T '*' F { set _ [expr {$1 * $3}] } | T '/' F { set _ [expr {$1 / $3}] } | F ; F: '(' E ')' { set _ $2 } | ID { set _ $::yylval } ; %% source simple_scanner.tcl; yyparse This is, of course, the infamous calculator example. Several new things are presented here: * There is a literal block ('''#!/usr/bin/tclsh''') and a user subroutine section ('''source simple_scanner.tcl; yyparse'''). * The start state is explicitly given; by default taccle (like yacc) uses the first rule as a start state. * Associated with most of the rules are ''actions'' to take when that rule gets reduced. Here is where taccle deviates from yacc. Whereas yacc uses '''$$''' for synthesized attribute taccle calls the variable '''$_'''. * The rules without explicitly declared actions use the default one '''set _ $1'''. This is analogous to yacc's '''$$ = $1;'''. * Yacc declares a global variable '''yylval''' typically as a C union. Because Tcl does not have unions, or for that matter variable types (for the most part), the '''%union''' and '''%type''' keywords are unnecessary. This example assumes that the token generator places into '''::yylval''' attributes of the correct type (e.g., '''ID''' will not have a non-numeric attribute). Sorry, you'll have to do your own error checking. * This example does not make use of '''[[pkg_mkIndex]''' so it is necessary to source in the token generator. In this case the generator was developed used [fickle]. New with taccle version 0.2 are: * epsilon transitions * error recovery with the '''error''' token * rename all variables with -p option There are some things taccle cannot handle. These are on my TODO list: * embedded actions (basically, if I can convert NDFA to DFA I'll be able do this too) * detect infinitely recursive rules (e.g., ''foo -> foo '42' ;'') * inherited attributes (synthesized attributes are easy, inherited not so) * operator precedence ('''%left''', '''%right''', etc) ---- '''Downloads''' taccle is protected by the GNU General Public License. You should read the ''README'' file before use; a complete set of examples are in the ''examples'' subdirectory. Familiarity with the Dragon book as well as ''lex & yacc'' would also prove useful. Download taccle from below: * taccle 0.2 at http://tcl.jtang.org/taccle/taccle-0.2.tar.gz. * taccle 0.1 at http://tcl.jtang.org/taccle/taccle-0.1.tar.gz. ---- Comments below: ---- [FPX]: Nice. At the time, I wanted to write a supplementary program for [yeti] to parse a yacc-like input file and produce a parser from that. I never got around to it. I wanted to write the program twice: Once, based on [yeti] primitives, and second, using itself -- the litmus test for every compiler is to compile itself ;) ---- Return to [Jason Tang] ---- [Category Application] | [Category Dev. Tools]