taccle -- '''T'''accle is '''A'''nother '''C'''ompiler '''C'''ompi'''le'''r taccle is a complement to [fickle] in that it reads a ''taccle specification file'' to generate working LALR(1) parser. In other words, it is to Tcl what ''yacc'' (or ''bison'') is to C/C++. taccle differs from [yeti] in that the grammar is written before hand as a straight text file rather than generated by procedure calls. Unlike tyacc [http://www.bodenstab.org/] taccle is written in pure Tcl. ([FP]: Nice. At the time, I wanted to write a supplementary program for [yeti] to parse a yacc-like input file and produce a parser from that. I never got around to it, as I wanted to write the program twice. Once, based on [yeti] primitives, and second, using itself -- the lithmus test for every compiler is to compile yourself ;)) taccle spec files are structured nearly identical to those used by yacc. The following example (blantantly stolen from chapter 8 of ''lex & yacc''[http://www.oreilly.com/catalog/lex/]) may be interpreted equally by the two: %token A R %% start: x | y R; x: A R; y: A; Incidentally both yacc and taccle would recognize the shift/reduce conflict above. '''Origin of taccle''' I originally wrote taccle to learn about the fun world of LALR(1). As that I do not have a PhD in automata theory the LALR(1) generation is a bit inefficient; I calculate the canonical LR(1) and then merge cores. For those learning parsing theory and [syntax parsing in tcl] taccle can be set to display every step. taccle calculates seven states in the canonical LR(1): lr(1) table: state 0: start' -> . start, {$} start -> . x, {$} start -> . y R, {$} x -> . A R, {$} y -> . A, R transitions: A => s1 start => s2 x => s3 y => s4 state 1: x -> A . R, {$} y -> A ., R transitions: R => s5 state 2: start' -> start ., {$} state 3: start -> x ., {$} state 4: start -> y . R, {$} transitions: R => s6 state 5: x -> A R ., {$} state 6: start -> y R ., {$} '''{$}''' is the standard end of input symbol. Converting to LALR(1) gains nothing in this case. The fully generated parse table is: generated lalr(1) parse table: state A R start x y $ 0 sh1 go2 go3 go4 1 sh5 2 accept 3 re1 4 sh6 5 re3 6 re2 Note how taccle resolved the shift/reduce conflict in state 1 by giving precedence to shift. Running bison 1.35 on this same input file results in 9 states. Therefore taccle is 23% more efficient than bison. '''' '''A Practical Example''' Here is another example. The file has been compacted to make it better fit on the web page. %{ #!/usr/bin/tclsh %} %token ID %start start %% start: E { puts "Result is $yy_1" } ; E: E '+' T { set yy_ [expr {$yy_1 + $yy_3}] } | E '-' T { set yy_ [expr {$yy_1 - $yy_3}] } | T ; T: T '*' F { set yy_ [expr {$yy_1 * $yy_3}] } | T '/' F { set yy_ [expr {$yy_1 / $yy_3}] } | F ; F: '(' E ')' { set yy_ $yy_2 } | ID { set yy_ $::yylval } ; %% source simple_scanner.tcl; yyparse This is, of course, the infamous calculator example. Several new things are presented here: * There is a literal block ('''#!/usr/bin/tclsh''') and a user subroutine section ('''source simple_scanner.tcl; yyparse'''). * The start state is explicitly given; by default taccle (like yacc) uses the first rule as a start state. * Associated with most of the rules are ''actions'' to take when that rule gets reduced. Here is where taccle deviates from yacc. Whereas yacc uses '''$1''', '''$2''', and so forth as token attributes, taccle declares variables '''yy_1''', '''yy_2''', etc. In addition the synthesized attribute '''$$''' has been renamed '''yy_''' in taccle. * The rules without explicitly declared actions use the default one '''set yy_ $yy_1'''. This is analogous to yacc's '''$$ = $1;'''. * Yacc declares a global variable '''yylval''' typically as a C union. Because Tcl does not have unions, or for that matter variable types (for the most part), the '''%union''' and '''%type''' keywords are unnecessary. This example assumes that the token generator places into '''::yylval''' attributes of the correct type (e.g., '''ID''' will not have a non-numeric attribute). Sorry, you'll have to do your own error checking. * This example does not make use of '''[[pkg_mkIndex]''' so it is necessary to source in the token generator. In this case the generator was developed used [fickle]. There are some things taccle cannot handle. These are on my TODO list: * epsilon transitions * embedded actions (basically, if I can convert NDFA to DFA I'll be able do this too) * error recovery (the '''error''' token, '''yyerrok''' and so forth) * detect infinitely recursive rules (e.g., ''foo -> foo '42' ;'') * inherited attributes (synthesized attributes are easy, inherited not so) * operator precedence ('''%left''', '''%right''', etc) ---- '''Downloads''' taccle is protected by the GNU General Public License. You should read the ''README'' file before use; a complete set of examples are in the ''examples'' subdirectory. Familiarity with the Dragon book as well as ''lex & yacc'' would also prove useful. Version 0.1 may be obtained at http://tcl.jtang.org/taccle/taccle-0.1.tar.gz. ---- Comments below: ---- Return to [Jason Tang] ---- [Category Application] | [Category Dev. Tools]