Version 20 of grammar_peg

Updated 2011-01-21 19:41:21 by AK

Actually several packages:

  • grammar::peg - Construction and manipulation of parsing expression grammars
  • grammar::peg::interp - Interpreter for parsing expression grammars.


3-9-8: I hope the following may help others who are trying to use the "grammar::peg" package. These were tested under "etcl" (Evolane Tcl/Tk Engine Version 1.0-rc26 See ).

1% # For now, just load the 'grammar::peg' package
2% package require grammar::peg

3% # Initialize a new (albeit empty) grammar.  The start symbol will be, by default, 'epsilon'.
4% grammar::peg myGrammar

5% # add a nonterminal to it
6% ::myGrammar nonterminal add myDigit {/ {t 0} {t 1} {t 2}}

7% # show its serialization
8% ::myGrammar serialize
grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} epsilon

9% # change the start symbol to the nonterminal 'myDigit'
10% ::myGrammar start {/ {n myDigit}}

11% # show its serialization
12% ::myGrammar serialize
grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}}

13% # Now we show how you could, instead, have started with a list representing
14% # the grammar. Since "myGrammar" exists, we will make "myGrammarFromList".
15% package require struct::list

16% # (Incidentally, Andreas Kupries states: 'struct::list' is a package in Tcllib, like grammar::peg, and
17% #                  guesses that grammar::peg forgot to 'package require' it.)
18% set mySerialization [list grammar::pegc \
> {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}} \
> ]
grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}}

19% grammar::peg myGrammarFromList deserialize $mySerialization
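
The serialization used above is an ordinary four-element Tcl list, so it can be inspected with plain list and dict commands before it is handed to deserialize. The following is only a sketch: the layout (type tag, rules dictionary, modes dictionary, start expression) is read off the output shown above, and the proc name describe-peg is made up for illustration.

```tcl
# Assumed layout, read off the serialization shown in the session:
#   grammar::pegc {nonterminal expression ...} {nonterminal mode ...} startExpression
set ser {grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}}}

# Hypothetical helper: return one readable line per part of the grammar.
proc describe-peg {ser} {
    lassign $ser tag rules modes start
    if {$tag ne "grammar::pegc"} {
        error "not a grammar::pegc serialization: $tag"
    }
    set lines [list "start: $start"]
    dict for {nt pe} $rules {
        # Each nonterminal has a parsing expression and a mode (e.g. value).
        lappend lines "$nt ([dict get $modes $nt]): $pe"
    }
    return $lines
}

puts [join [describe-peg $ser] \n]
```

This needs only Tcl 8.5 or later (for lassign and dict) and does not load Tcllib.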

20% # Next we attempt to use (interpret) the grammar on an input string.
21% #Load all relevant packages (this may be overkill)
22% set pattern {grammar::me|grammar::peg}; foreach packageName [package names] {if [regexp $pattern $packageName] {puts "[catch {package require $packageName}]  $packageName"}}
0  grammar::peg::interp
0  grammar::peg
0  grammar::me::tcl
0  grammar::me::cpu
0  grammar::me::util
0  grammar::me::cpu::core
0  grammar::me::cpu::gasm

23% # Initialize the interpreter.
24% grammar::peg::interp::setup ::myGrammar
25% # As of 3-9-8, that's as far as I have gotten.  The documentation at
26% # suggests using 
27% #                            ::grammar::peg::interp::parse nextcmd errorvar astvar 
28% # which "interprets the loaded grammar and tries to match it against the stream of characters represented by the command prefix nextcmd".
29% # so I put the string "102" in "myFile.txt" and tried the following, but, as you can see, was unsuccessful.

30% set f [open c:/myFile.txt r+]
31% chan configure $f -encoding cp1252
32% set offset 0
33% ::grammar::peg::interp::parse $f myErrorVar myAstVar
invalid command name "file37bf6c8"
34% chan close $f
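
The "invalid command name" error is consistent with parse treating its first argument as a command prefix and invoking it: $f is a channel handle, not a command. Below is a minimal sketch of a callback that could be passed instead. It assumes the convention described for grammar::me::tcl, namely that each call returns either an empty list at end of input or a four-element list of token, lexeme, line and column. The namespace ::charsrc and its procs are hypothetical names, and the hand-over to the interpreter is shown only as a comment because it was not run here.

```tcl
# Hypothetical character source over a string held in a namespace.
namespace eval ::charsrc {
    variable text ""
    variable pos  0
    variable line 1
    variable col  0
}

# Store the text to be parsed and rewind to its beginning.
proc ::charsrc::reset {s} {
    variable text $s
    variable pos  0
    variable line 1
    variable col  0
}

# Return the next character as {token lexeme line column},
# or an empty list once the input is exhausted (on every further call too).
proc ::charsrc::nextchar {} {
    variable text
    variable pos
    variable line
    variable col
    if {$pos >= [string length $text]} {
        return {}
    }
    set ch [string index $text $pos]
    incr pos
    incr col
    set tok [list $ch $ch $line $col]
    if {$ch eq "\n"} { incr line; set col 0 }
    return $tok
}

::charsrc::reset "102"
# With Tcllib loaded, the callback would be handed over like this (assumed usage):
#   ::grammar::peg::interp::setup ::myGrammar
#   ::grammar::peg::interp::parse ::charsrc::nextchar myErrorVar myAstVar
puts [::charsrc::nextchar]
```

To parse a file, its contents could be read with read $f and passed to ::charsrc::reset.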

JBR I've made my own attempt to create a grammar::peg example. Maybe someone who knows can make this example work. Thanks.

 #!/usr/bin/env tclsh8.6

 package require grammar::peg
 package require grammar::peg::interp

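 proc parse-string { string } {
    coroutine next-char apply { { string } {
        # Creating the coroutine runs the body up to its first yield, so
        # pause once here; otherwise the first character would be lost.
        yield

        set i 1
        foreach ch [split $string {}] {
            # presumably: token, lexeme, line, column
            yield [list $ch 1 1 $i]
            incr i
        }

        # Input exhausted: keep signalling end of input on every call.
        while { 1 } { yield {} }
    } } $string
 }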

 ::grammar::peg parser

 parser nonterminal add Digit { / { t 1 } { t 2 } { t 3 } { t 4 } { t 5 } }
 parser nonterminal add Int   { + { n Digit } }
 parser start { n Int }

 parse-string 54321

 grammar::peg::interp::setup parser
 grammar::peg::interp::parse next-char err ast

When run I get this:

 wrong # args: should be "ict_match_token tok msg"
    while executing
 "ict_match_token "Expected $ch""
    (procedure "MatchExpr" line 40)
    invoked from within
 "MatchExpr $e"
    (procedure "MatchExpr" line 236)
    invoked from within
 "MatchExpr $ru($nt)"
    (procedure "MatchExpr" line 75)
    invoked from within
 "MatchExpr $sub"
    (procedure "MatchExpr" line 159)
    invoked from within
 "MatchExpr $ru($nt)"
    (procedure "MatchExpr" line 75)
    invoked from within
 "MatchExpr $se"
    (procedure "grammar::peg::interp::parse" line 9)
    invoked from within
 "grammar::peg::interp::parse next-char err ast"
    (file "./ISBL" line 30)

This appears to be a bug in grammar::peg::interp. I changed line 116 of peg_interp.tcl to:

 ict_match_token $ch "Expected $ch"

Then, after changing my coroutine to return EOF repeatedly once the input is exhausted, the string parses, returning:

 ALL 0 4 {Int 0 4 {Digit 0 0 {{} 0 0}} {Digit 1 1 {{} 1 1}} {Digit 2 2 {{} 2 2}} {Digit 3 3 {{} 3 3}} {Digit 4 4 {{} 4 4}}}

Now to learn how to do something useful with this. The input tokens don't appear to be represented in the output AST?

AK - 2011-01-21 14:40:21

Regarding the representation of input tokens. No, they are not represented directly. The numbers in the AST tell you the character range covered by the symbol, as offsets from the beginning of the string. This allows you to extract the lexemes/tokens from the input string. See for an example of the tree and its contents.
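
As a sketch of that extraction, the AST printed earlier can be walked with string range to recover the text each node covers. The node layout assumed here (symbol, start offset, end offset, then child nodes, with zero-based inclusive offsets and an empty symbol for terminals) is inferred from that output, not taken from the package documentation, and the proc name lexemes is made up for illustration.

```tcl
# Hypothetical helper: list every AST node with the input text it covers.
# Assumed node layout: {symbol start end ?child ...?}, offsets inclusive.
proc lexemes {ast input {depth 0}} {
    set children [lassign $ast symbol from to]
    set name   [expr {$symbol eq "" ? "(terminal)" : $symbol}]
    set indent [string repeat "  " $depth]
    set out [list "${indent}${name}: [string range $input $from $to]"]
    foreach child $children {
        lappend out {*}[lexemes $child $input [expr {$depth + 1}]]
    }
    return $out
}

# The AST reported above for the input 54321.
set ast {ALL 0 4 {Int 0 4 {Digit 0 0 {{} 0 0}} {Digit 1 1 {{} 1 1}} {Digit 2 2 {{} 2 2}} {Digit 3 3 {{} 3 3}} {Digit 4 4 {{} 4 4}}}}

puts [join [lexemes $ast 54321] \n]
```

Each Digit node then maps back to one character of the input, and the Int node to the whole string.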