Statically verifying arithmetic and regular expressions

Arjen Markus (18 august 2007) Inspired by an article by Diomidis Spinellis and Panagiotis Louridas, A framework for the static verification of API calls (Journal of Systems and Software, vol. 80, (2007), 1156-1168)[L1 ] I thought how can this idea be applied to Tcl?

One of the problems with static analysis of source code is that it may contain strings belonging to one or more domain-specific languages (DSL). For basic Tcl you can think of the format strings to scan and format, but also of regular and arithmetic expressions. If you use extensions that deal with XML or SQL, then there is a host of other domain-specific languages to worry about.

The article describes a way to verify API calls with such strings and to solve other problems that occur with analyzing/verifying source code. They specifically refer to Java programs, but with Tcl you get similar problems.

Now, the program below is just a simple attempt to check arithmetic expressions. It illustrates the idea that we let Tcl do the work for us.

You could write a program to verify the syntax of an arithmetic expression or another DSL, but that is hard work (that is: it is no fun job) and the chances of making a mistake are very large. Besides, whenever there is an extension to the DSL, you would have to revise your program. And Tcl or any extension are already checking the syntax. So why not let Tcl do the tough job?

There is one drawback though: it has to analyze one error message from the expr command. If that message changes, then the program may break.


 namespace eval ::Verification {
     # Define the namespace
 }

 # validArithmeticExpr --
 #     Verify that an arithmetic expression has the correct syntax
 #
 # Arguments:
 #     expression  String containing an arithmetic expression
 #
 # Result:
 #     List with three elements:
 #     First: 1 if the string is correct, 0 otherwise
 #     Second: first line of errorInfo
 #     Third: errorCode
 #     (For detailed error reporting)
 #
 proc ::Verification::validArithmeticExpr {expression} {

     #
     # The expression will contain references to variables -
     # catch them and fill in some default value
     #
     set ::Verification::result 1
     while {1} {
         catch {unset ::errorCode}
         catch {unset ::errorInfo}
         if { [catch {
             expr $expression
         } ::Verification::msg] } {
             if { [regexp {can't read "([^"]+)"} $::Verification::msg ==> varname] } {
                 set $varname 2
                 unset varname  ;# Avoid local variables
             } else {
                 set ::Verification::result 0

                 #
                 # Arithmetic errors can be artifacts due to our substuting
                 # value - either ignore them or report them, there could
                 # be a problem in the expression itself
                 #
                 # Uncomment if you want to report them anyway
                 # if { [lindex $::errorCode 0] != "NONE" } {
                 #      set ::Verification::result 1
                 # }

                 break
             }
         } else {
             break
         }
     }

     if { $::Verification::result } {
         return [list 1 "" ""]
     } else {
         return [list $::Verification::result [lindex [split $::errorInfo \n] 0] $::errorCode]
     }
 }

 # validRegularExpr --
 #     Verify that an regular expression has the correct syntax
 #
 # Arguments:
 #     re          String containing a regular expression
 #
 # Result:
 #     List with three elements:
 #     First: 1 if the string is correct, 0 otherwise
 #     Second: first line of errorInfo
 #     Third: errorCode
 #     (For detailed error reporting)
 #
 proc ::Verification::validRegularExpr {re} {

     #
     # The regular expression might contain references to variables -
     # but they need to be identified and dealt with in the surrounding
     # code! So, ignore that possibility
     #
     set ::Verification::result 1
     if { [catch {
         regexp $re ""
     } ::Verification::msg] } {
        set ::Verification::result 0
     }

     if { $::Verification::result } {
         return [list 1 "" ""]
     } else {
         return [list $::Verification::result [lindex [split $::errorInfo \n] 0] $::errorCode]
     }
 }

 # main --
 #     Test a few expressions
 #
 foreach e {
     {$a+$b}
     {$a+b}
     {$a/$b}
     {$a/unknown($b)}
     {$a/log($b)}
     {$a*atan2($b)}
 } {
     set result [::Verification::validArithmeticExpr $e]
     set valid  [lindex $result 0]
     puts "$e: [expr {$valid? "valid": "invalid"}] - [lindex $result 1]"
 }

 foreach e {
     {[0-9]+}
     {[}
     {\[}
     {A literal.*string}
 } {
     set result [::Verification::validRegularExpr $e]
     set valid  [lindex $result 0]
     puts "$e: [expr {$valid? "valid": "invalid"}] - [lindex $result 1]"
 }