Errors management

Fabricio Rocha - 08-Feb-2010 - Error treatment in programming always seems to be an underestimated topic, often left untold by and to newbies, although it is a useful thing that could naturally be taught along with the basics of a programming language. Only after some two years of studying Tcl/Tk was I able to find some information about this subject and develop for myself a very basic and limited idea of how applications can avoid being crashed by bugs or misuse. I would therefore like to discuss some error management techniques with the experienced folks, while building up a tutorial from this discussion (something highly useful for aspiring Tclers like me). And, please, treat the errors you find...

What is error management?

Error management is about handling (and sometimes deliberately causing) error termination in scripts. Error termination is one of the modes of termination (see Script termination: results and control) provided by Tcl. As with all modes of termination, the interpreter gathers three pieces of information about the termination: the return code (always 1 in the case of error termination), the result value (by convention, the result value consists of an error message in the case of error termination), and the return options dictionary (a dictionary structure, described under its own heading below).
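As a minimal sketch, the three pieces of information can be observed with the catch command (covered further down). The proc name fail is made up for this illustration, and the three-variable form of catch assumes Tcl 8.5 or later:

```tcl
# Hypothetical command that terminates with an error.
proc fail {} {
    error "something went wrong"
}

# catch returns the return code; the result value and the return
# options dictionary are stored in the two named variables.
set code [catch { fail } result options]

puts "return code:  $code"
puts "result value: $result"
puts "option keys:  [lsort [dict keys $options]]"
```

The exact set of option keys depends on the Tcl version, which is why the sketch prints them instead of assuming them.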

Exceptions happen in an unplanned manner in programs: numeric operations overflow, a divisor value happens to be 0, a resource turns out to be unavailable, etc. Such events need to be dealt with to avoid the program crashing. Exceptions can, on the other hand, also be used as a form of flow control to e.g. abandon a deeply nested subroutine that can't complete its task for some reason, and withdraw directly to a save point location higher up in the call stack.
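The flow-control use might look like the following sketch, which assumes Tcl 8.6 (for throw and try, both described below). The proc names walk and depthOf and the error code {SEARCH FOUND} are invented for the example:

```tcl
# Walk a nested list; as soon as the target is found, abandon the
# recursion by raising an exception that carries the nesting depth.
proc walk {tree target depth} {
    foreach item $tree {
        if {[llength $item] > 1} {
            walk $item $target [expr {$depth + 1}]
        } elseif {$item eq $target} {
            throw {SEARCH FOUND} $depth
        }
    }
}

# The "save point": the exception unwinds directly to this try.
proc depthOf {tree target} {
    try {
        walk $tree $target 0
    } trap {SEARCH FOUND} {depth} {
        return $depth
    }
    return -1
}

puts [depthOf {a {b {c d}} e} d]   ;# 2
puts [depthOf {a b} z]             ;# -1
```

No intermediate level of walk needs to check whether a deeper call succeeded; the exception skips all of them.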

Which error-management features are provided by Tcl?

Exception raising commands

The following commands cause an error (raise an exception) that must be handled by code on an earlier level on the call stack: if the error isn't handled (see below), the program stops. Exception-raising commands can add to the error information gathered by the interpreter, as described in the following.

return

The return command has been able to raise exceptions with -code error for a long time; since Tcl 8.5 it also accepts the -level and -options options and records its options in the return options dictionary.

When used to raise an exception, the minimal invocation is return -code error. Instead of -code error you may write -code 1: it has the same meaning.

The value of the -code option is stored in the return options dictionary. Other options, such as -level, -errorinfo, -errorcode, -errorline, and -errorstack, may be added to the invocation of return and have their values stored in the return options dictionary too. The option -options, with a dictionary for value, may be used to set the return options dictionary all at once.

The return command always passes a return code equal to the value of the -code option (which, again, should be 1 or error when raising an exception) and a result value equal to the optional result parameter, which comes after any options given. It basically sets up the return options dictionary according to the invocation, with default values used if some options aren't provided.
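A small sketch of raising an exception with return; the proc makeBuffer and the error code {BUFFER BADSIZE} are hypothetical names chosen for the example:

```tcl
# Reject invalid sizes by raising an exception from within the proc.
proc makeBuffer {size} {
    if {![string is integer -strict $size] || $size <= 0} {
        return -code error -errorcode {BUFFER BADSIZE} \
            "size must be a positive integer, got \"$size\""
    }
    return [string repeat . $size]
}

set code [catch { makeBuffer -3 } result options]
puts "$code: $result"
puts [dict get $options -errorcode]   ;# BUFFER BADSIZE
```

Because the default -level is 1, the error appears to the caller as if the makeBuffer invocation itself had failed, which is usually what a proc author wants.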

See return and the man page for a more thorough description of the command.

error

The error command has been the primary exception-raising command since well before Tcl 8.4.

It can be invoked in three ways:

                 The invocation...          ...is functionally equivalent to:
"Unary error"    error message              return -code error -level 0 message
"Binary error"   error message info         return -code error -level 0 -errorinfo info message
"Ternary error"  error message info code    return -code error -level 0 -errorinfo info -errorcode code message

The error command always passes a return code of 1 and a result value equal to the message parameter. It sets up the return options dictionary a bit differently depending on the invocation.
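The three invocations can be compared with a short sketch. The message texts and the code {MYAPP DISKFULL} are invented, and the dictionary is read with the Tcl 8.5+ form of catch:

```tcl
# Unary form: only a message; -errorcode defaults to NONE.
catch { error "plain failure" } message options
puts [dict get $options -errorcode]   ;# NONE

# Binary form: the info argument seeds the -errorinfo value.
catch { error "plain failure" "custom first line" } message options
puts [lindex [split [dict get $options -errorinfo] \n] 0]   ;# custom first line

# Ternary form with an empty info argument: default -errorinfo,
# but a caller-defined -errorcode.
catch { error "disk full" "" {MYAPP DISKFULL} } message options
puts [dict get $options -errorcode]   ;# MYAPP DISKFULL
```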

See error and the man page for a more thorough description of the command.

throw

The throw command was added in Tcl 8.6.

It is invoked like this:

throw type message

which is functionally equivalent to:

return -code error -level 0 -errorcode type message

The throw command always passes a return code of 1 and a result value equal to the message parameter.
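A brief sketch, assuming Tcl 8.6; the proc withdraw and the type {BANK INSUFFICIENT} are hypothetical:

```tcl
# Raise a classified exception when the request cannot be honoured.
proc withdraw {balance amount} {
    if {$amount > $balance} {
        throw {BANK INSUFFICIENT} "cannot withdraw $amount from $balance"
    }
    expr {$balance - $amount}
}

set code [catch { withdraw 10 25 } result options]
puts "$code: $result"                 ;# 1: cannot withdraw 25 from 10
puts [dict get $options -errorcode]   ;# BANK INSUFFICIENT
```

Putting the machine-readable type first makes throw a natural partner for the trap clauses of try, described below.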

See throw and the man page for a more thorough description of the command.

Dealing with an unknown command

This is a conditionally exception-raising command. When Tcl is asked to execute a command procedure it doesn't know, it first attempts to save the situation by invoking another command named unknown, which then goes through a multi-step procedure to find the command that was asked for. If it still fails to find the command that was invoked, it raises an exception. You can override this behavior by providing your own unknown procedure.

Since Tcl 8.5, there is also the namespace unknown command, which allows the programmer to name a procedure which will be called when a command/procedure lookup fails in the scope of a specific namespace.

The unknown command doesn't always raise an exception, but when it does, it passes a return code of 1 and a result value equal to invalid command name "yyz" (should the (non-existent) command yyz be invoked).
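The following sketch shows both behaviours in a non-interactive script, assuming Tcl 8.5 or later. The namespace shop, its handler fallback and the command frobnicate are made-up names:

```tcl
# Default behaviour: the lookup fails and an exception is raised.
set code [catch { yyz } message]
puts "$code: $message"   ;# 1: invalid command name "yyz"

# Per-namespace handler: it receives the words of the failed invocation.
namespace eval shop {
    proc fallback {name args} {
        return "no such command in shop: $name"
    }
    namespace unknown [namespace current]::fallback
}

puts [namespace eval shop { frobnicate 1 2 }]
```

Here the handler returns normally, so the failed lookup does not raise an exception at all; a handler could equally well raise one of its own.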

See unknown and the man page for a more thorough description of the command. The namespace unknown command is described on the namespace man page.

Exception handling commands

An exception that isn't handled by the program is eventually handled by a default handler which basically only does two things: 1) output an error message, and 2) stop the program. If you want to do something else, like for instance deal with the error somehow and go on, or give up and at least save important data, you need to define an exception handler. The handler will deal with exceptions from commands invoked in the body script or from commands called by those commands, and so on.

An exception handler has two parts: the catcher and the dispatcher. The catcher executes a script or body and intercepts any exceptions raised during the execution. The interpreter passes the return code, the result value, and the return options dictionary to the handler. This information can then be used to dispatch to the exact piece of code (the handler clause) that the programmer has assigned to deal with the kind of termination in question.

catch

The catch command has been the basic exception-handling command since well before Tcl 8.4.

The catch command performs the catching part of exception handling, but does not dispatch. If fail is a command that raises an exception, this invocation:

catch { fail }

simply prevents the exception from stopping the program but does nothing to deal with the problem. The traditional invocation looks like this:

set returnCode [catch { script } result]

In this case, after running catch the variable result contains the result value at termination of script. The result value of catch (which is stored in returnCode in this snippet) is the return code of the termination. If this return code is 0, no exception was raised and the variable result contains valid data, the return value of script. If it is greater than 0, the script was interrupted by an error or control flow operation (see Script termination: results and control for details).

This means that we can build a simplistic dispatcher using the return code from catch (in this example, only dealing with the non-zero and zero cases):

set filename foo.bar
set returnCode [catch {
    open $filename r
} result]
if {$returnCode} {
    # something happened, possibly an error, $result is likely to be an error message
    puts "Ouch: $result"
} else {
    # everything is ok, $result is a channel identifier
    set f $result
}

This construct will either set the variable f to whatever open returns, or if something happens, dispatch to the code puts "Ouch: $result".

A more general dispatching construct:

switch [catch { script }] {
    0 { puts "all is well, proceed" }
    1 { puts "an exception was raised" }
    2 { puts "return was invoked" }
    3 { puts "break was invoked" }
    4 { puts "continue was invoked" }
    default { puts "some other return code was delivered" }
}

For more fine-grained dispatching while handling exceptions, information from the return options dictionary (especially the -errorcode value) can be used. (The format of the value of -errorcode is discussed below.)

switch [catch { script } result options] {
    0 { puts "all is well, proceed with the value ($result)" }
    1 {
        switch -regexp -- [dict get $options -errorcode] {
            {POSIX EACCES} {
                puts "Handling the POSIX \"permission denied\" error: $result"
            }
            POSIX {
                puts "Handling any other kind of POSIX error: $result"
            }
            default {
                puts "Handling any kind of exception at all: $result"
            }
        }
    }
    default { # other handling }
}

See catch and the man page for a more thorough description of the command.

try

The try command was added in Tcl 8.6.

The try command performs both the catching and dispatching parts of exception handling. The dispatching is defined by adding handler clauses to the command invocation:

try {
    script
} on ok {} {
    puts "all is well, proceed"
} on error {} {
    puts "handling an exception"
} on return {} {
    puts "return was invoked"
} on break {} {
    puts "break was invoked"
} on continue {} {
    puts "continue was invoked"
}

There is no way to define a default handler clause to round up any remaining modes of termination, but if custom return codes like, say, 42 and 1138 are expected to be returned from a script, you can add the following specific clauses:

# ...
} on 42 {} {
    puts "don't panic"
} on 1138 {} {
    puts "I've got a bad feeling about this"
# ...

It isn't necessary to define all kinds of handler clauses, indeed no handler clause at all needs to be defined. If none of the defined handler clauses get the dispatch, the exception is propagated outside the try construct.
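A sketch of both points, assuming Tcl 8.6; the proc answer is hypothetical and simply terminates with the custom return code 42:

```tcl
# A command that terminates with a custom return code.
proc answer {} {
    return -code 42 "the answer"
}

# A matching clause receives the dispatch.
set outcome [try {
    answer
} on 42 {result} {
    set message "custom return code 42: $result"
}]
puts $outcome   ;# custom return code 42: the answer

# No clause matches code 42 here, so the termination propagates
# out of the try construct and is seen by the enclosing catch.
set code [catch {
    try {
        answer
    } on error {result} {
        set message "not reached"
    }
} result]
puts "$code: $result"   ;# 42: the answer
```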

Dispatching can also be based on the -errorcode value in the return options dictionary by using trap-style handler clauses:

try {
    script
} on ok {result} {
    puts "all is well, proceed with the value ($result)"
} trap {POSIX EACCES} {result} {
    puts "Handling the POSIX \"permission denied\" error: $result"
} trap POSIX {result} {
    puts "Handling any other kind of POSIX error: $result"
} on error {result} {
    puts "Handling any kind of exception at all: $result"
} on return {} - on break {} - on continue {} {
    # other handling
}

The handler clauses should be ordered from most specific to most generic. Note that the on error handler clause must be placed after any trap handler clauses, because it picks up any kind of error exception.

If a handler clause needs to examine the return options dictionary, it can be passed as a second variable to the handler clause:

try {
    script
} on ok {result} {
    puts "all is well, proceed with the value ($result)"
} trap {POSIX EACCES} {result options} {
    puts "Handling the POSIX \"permission denied\" error on line [dict get $options -errorline]: $result"
} on error {result} {
    puts "Handling any kind of exception at all: $result"
}

You can add a finally script clause to the try construct. The code in script will be run whether an error occurs or not and regardless of which handler clause, if any, was dispatched to.

# $f is a channel identifier for an open file
try {
    # do something with $f that might fail
} on ok {result} {
    # handle success
} on error {result options} {
    # handle failure
} finally {
    close $f
}

The result value of the try construct is the result value of the handler clause that was dispatched to, or of the script if none of the handler clauses were engaged. The result value of the finally clause is never used.
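A small sketch of where the result comes from, assuming Tcl 8.6; the variable names are arbitrary:

```tcl
set log {}

set value [try {
    expr {6 * 7}
} on ok {result} {
    lappend log "ok handler"
    # The handler's last command supplies the result of try.
    expr {$result + 1}
} finally {
    lappend log "finally"
    # This value is discarded.
    expr {0}
}]

puts $value   ;# 43
puts $log     ;# {ok handler} finally
```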

See try and the man page for a more thorough description of the command.

Background error handling

Tcl/Tk automatically catches exceptions that are raised in background processing (e.g. when events are processed in an update or vwait call) and dispatches them to a programmer-defined handler procedure, if available. If such an exception handler is registered with interp bgerror (available as of Tcl 8.5) the result value of the termination (an error message) and the return options dictionary will be passed to the handler. If no such handler is registered for the active interpreter, it instead attempts to call a global command procedure named bgerror which the programmer needs to define (it doesn't exist otherwise). The bgerror command only gets one argument passed to it, an error message. If bgerror isn't available, the error message is simply displayed.

If fail is a command that raises an exception, and we run this code in a Tk-enabled console:

bind . <K> fail

and then press K, we get a dialog box pointing out that an error has occurred.

If we define the bgerror command:

proc bgerror {result} {
    puts "I'm bgerror"
}

and then press K, we get the message "I'm bgerror".

If we define this command:

proc myHandler {result options} {
    puts "I'm myHandler"
}

and then run this command:

interp bgerror {} myHandler

and then press K, we get the message "I'm myHandler".
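The same mechanism can be tried without Tk. This sketch (Tcl 8.5 or later) raises an error inside an after script and waits in the event loop until the handler has run; the handler name, the global variable handled and the code {DEMO BG} are made up for the example:

```tcl
# Handler registered with interp bgerror: it receives the error
# message and the return options dictionary.
proc myHandler {message options} {
    set ::handled [list $message [dict get $options -errorcode]]
}
interp bgerror {} myHandler

# Schedule a script that fails during background processing.
after 0 { error "background failure" "" {DEMO BG} }

# Enter the event loop until the handler has recorded the error.
vwait ::handled
puts $::handled   ;# {background failure} {DEMO BG}
```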

See bgerror and the man page for a more thorough description of the command. The interp bgerror invocation is described on the interp man page, which also covers background exception handling.

The return options dictionary and other exception information sources

The return options dictionary

Whenever an exception is raised, since version 8.5 Tcl creates a dictionary of keys and values that describe the exception:

Key          Used for/describes
-code        return category: only 1 signifies an actual exception
-level       stack level: 0 for all exceptions
-errorinfo   a brief human-readable description of the event, with an appended stack trace
-errorcode   machine-readable: either NONE or a list of code words that classify the exception; not the same as -code
-errorline   the line where the exception occurred
-errorstack  machine-readable: an even-sized list of code words and sub-lists that describe the calling path to the point where the exception was raised

(I've heard rumours (e.g. on the man page for try) of a seventh key, -during, that's supposed to store information on an original exception if another exception occurs during the handling of the first, but I'm unable to find any real documentation for it.)

As described above, this dictionary is passed by the interpreter to an exception handler, which may pass it on to a handler clause. It is passed with all modes of termination, but the -error* keys are only available on error termination.
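To see the difference between the two cases, a sketch like this one (Tcl 8.5 or later) prints the dictionary for an error termination and for a normal one; the variable names are arbitrary:

```tcl
# Error termination: the -error* keys are present.
catch { expr {1 / 0} } result errorOptions
dict for {key value} $errorOptions {
    puts [format "%-12s %s" $key $value]
}

# Normal termination: only -code and -level are present.
catch { set x 1 } result okOptions
puts $okOptions   ;# -code 0 -level 0
```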

Errorinfo

The value of the -errorinfo member (documented in full in the Tcl manual) is a multi-line string that consists of at least an error message and, on following lines, a stack trace that lists the command invocations between the handler and the point where the exception was raised, interspersed with lines like invoked from within or while executing. The try handler also adds a line reporting on which line in the try body the error occurred. The command invocations are shown unsubstituted, like error $msg. The first line is taken from one of:

  • the -errorinfo option to return, or
  • the info argument to error, or
  • the message / result argument to error, throw, or return, or
  • whatever error message a failing command produces
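A sketch that produces such a trace; the procs inner and outer are invented, and the exact wording and line numbers of the trace may vary between Tcl versions:

```tcl
proc inner {} {
    error "inner failed"
}
proc outer {} {
    inner
}

catch { outer } message options

# First line: the error message. Following lines: the stack trace.
puts [dict get $options -errorinfo]
```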

Errorcode

The value of the -errorcode member (documented in full in the Tcl manual) is a list of code words and data that is intended for processing by the program. When raising your own exceptions (and not, say, replicating an error generated by some other command), the single-value list NONE is generally applicable (it means "no further information available", not "no error"). You may also define your own format, in which case you should avoid beginning the list with any of the standard code words (which indicate the class of error and at the same time the format of the rest of the code value) ARITH, CHILDKILLED, CHILDSTATUS, CHILDSUSP, POSIX, or TCL. The value is taken from one of:

  • the -errorcode option to return, or
  • the code / type argument to error or throw, or
  • whatever error code is generated when a command fails

The CHILDKILLED and CHILDSUSP error classes include in their list the name of the signal that killed or suspended the child process; the name is one of those given in the signal.h C header file (most likely extended by signal names taken from Unix systems, for signals supported by the platform). The POSIX error class reports the symbolic name of the error as defined by the POSIX extension of the errno.h header file. Note that the level of compliance with the POSIX standard varies between platforms, with Windows being only partly compliant.
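A few error codes can be inspected with the following sketch. The path is assumed not to exist, and {MYAPP CONFIG MISSING} is a made-up application-specific code:

```tcl
# Arithmetic failures are reported in the ARITH class.
catch { expr {1 / 0} } message arithOptions
puts [dict get $arithOptions -errorcode]   ;# ARITH DIVZERO {divide by zero}

# Failing system calls are reported in the POSIX class; the second
# element is the symbolic errno name (assumption: the path is missing).
set missing [file join [pwd] no-such-directory no-such-file]
catch { open $missing r } message posixOptions
puts [dict get $posixOptions -errorcode]

# An application-defined code that avoids the standard class words.
catch { error "no configuration found" "" {MYAPP CONFIG MISSING} } message appOptions
puts [lindex [dict get $appOptions -errorcode] 0]   ;# MYAPP
```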

Errorstack

The value of the -errorstack member (documented in full in the Tcl manual) is an even-sized list where each pair of members represents one level on the call stack between the handler and the error. Every pair is a code word (or token) followed by a parameter list with one or more members: the members in the parameter list are the name of an invoked command or a descriptor like syntax, returnStk, returnImm, etc, followed by the parameter values as received by the command (i.e. substituted, unlike how they appear in the value of the -errorinfo member).

errorCode and errorInfo

Before Tcl 8.5, basic information on exceptions could be found in two global variables, errorCode and errorInfo. For backwards compatibility, the variables are still available. The errorCode variable contains, as might be expected, the same value as the -errorcode member of the return options dictionary for the latest error. Correspondingly, the errorInfo variable contains the same value as the -errorinfo member. The variables are in no way deprecated, but they are no longer needed: the return options dictionary should be used instead.
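A quick sketch comparing the two sources of information right after an error has been caught (Tcl 8.5 or later, so that both are available):

```tcl
catch { expr {1 / 0} } message options

# The global variable and the dictionary member describe the same error.
puts $::errorCode                     ;# ARITH DIVZERO {divide by zero}
puts [dict get $options -errorcode]   ;# ARITH DIVZERO {divide by zero}

# The first line of errorInfo is the error message.
puts [lindex [split $::errorInfo \n] 0]   ;# divide by zero
```

Unlike the dictionary, the global variables are overwritten by the next error anywhere in the interpreter, which is one reason to prefer the dictionary.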

The info command

The invocation info errorstack returns the value contained in the -errorstack member of the return options dictionary for the latest error.
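A sketch for looking at the error stack, assuming Tcl 8.6; the procs inner and outer are invented. The exact tokens in the output are not shown here because they depend on how the error was raised:

```tcl
proc inner {x} {
    error "bad value"
}
proc outer {x} {
    inner [expr {$x * 2}]
}

catch { outer 21 } message options

# The call to inner appears with its substituted argument (42),
# not with the unsubstituted expression.
set stack [dict get $options -errorstack]
puts $stack

# The same information for the latest error, without the dictionary.
puts [info errorstack]
```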

How to use all this stuff?

The infrastructure provided by Tcl allows applications to use exception handling, in the traditional sense of "try to do this, and if something goes wrong tell me and I'll see what I can do". This contrasts with the approach of "error prediction", which, for example, performs a series of tests on the data that will be passed to a command to check its validity before the operation is performed. The two techniques are not mutually exclusive, however. Tcl allows various approaches to error management, each with its pros and cons:

Approach 1: return, catch and process the error

1) Always use the advanced return options when writing procedures which can cause or face errors, or which may give back an invalid result;

2) Always use catch for calling commands or your own procedures which can cause or face errors as described in 1;

3) Create a procedure to be called in the case that catch captures an error, for interpreting the error codes and, based on that, show error messages in friendly and standardized dialogs and perform operations which could minimize or solve the error.
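The three steps might be sketched as follows, assuming Tcl 8.5 or later. The procs reportError and readConfig, the file name and the message texts are all hypothetical, and puts stands in for a friendly dialog:

```tcl
# Step 3: a central procedure that interprets the error code and
# returns a user-friendly message.
proc reportError {message options} {
    set code [dict get $options -errorcode]
    switch -glob -- $code {
        {POSIX ENOENT *} { return "The file could not be found." }
        {POSIX *}        { return "The operating system reported a problem: $message" }
        default          { return "Unexpected problem: $message" }
    }
}

# Step 1: a procedure that passes errors on with their full options.
proc readConfig {path} {
    if {[catch { open $path r } result options]} {
        # Re-raise with the original options so the caller still
        # sees the real error code.
        return -options $options $result
    }
    set data [read $result]
    close $result
    return $data
}

# Step 2: call the procedure under catch and hand failures to the reporter.
set path [file join [pwd] no-such-directory settings.conf]
if {[catch { readConfig $path } result options]} {
    set friendly [reportError $result $options]
    puts $friendly
}
```

Keeping the interpretation in one procedure means new error classes only need to be added in one place.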

Approach 2: tracing ::errorCode

Create a trace on ::errorCode, and a procedure to be called every time it is modified, for interpreting the codes, displaying them, providing minimization measures, etc.

Any other? Please add what you do!

LV One useful thing that I sometimes use is creation of log files containing information intended to be useful in determining the state of the program during particular points. Sometimes, displaying information about the values of a number of variables is not as helpful as having that information written to a file - for instance, there are times when a GUI application might not have easy access to stderr for error traces. Writing information to a log file, which is available - and perhaps even emailable - to the programmer responsible is helpful.

Which errors shall be told to the user?

Failure in files, channels and sockets operations?

Errors caused by invalid inputs. It is often useful to use a distinct error code (e.g., INVALID) for data validation errors, as it makes it possible for the application to distinguish between errors in the user's input and errors in the validation or execution code.
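A sketch of that distinction, assuming Tcl 8.6; the procs parseAge and askAge, the code {INVALID AGE} and the message texts are made up for the example:

```tcl
# Validation code: bad input is flagged with the INVALID class.
proc parseAge {input} {
    if {![string is integer -strict $input] || $input < 0 || $input > 150} {
        throw {INVALID AGE} "\"$input\" is not a valid age"
    }
    return $input
}

# Application code: tell input problems apart from everything else.
proc askAge {input} {
    try {
        parseAge $input
    } trap INVALID {message} {
        return "Please correct your input: $message"
    } on error {message} {
        return "Internal error, please report it: $message"
    }
}

puts [askAge abc]   ;# Please correct your input: "abc" is not a valid age
puts [askAge 42]    ;# 42
```

Any error raised by a bug in parseAge itself would carry a different error code and fall through to the on error clause.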

Which errors shall NOT be told to the user?

Syntax errors and programming bugs - They'd better be fixed. Sure, but....

LV Certainly they need to be fixed. However, if you hide the info from the user, how will the programmer know what the bug/error is? Unless you have a guaranteed method of getting said info to the programmer (and email doesn't count - the user MIGHT be working off line), then providing the user with sufficient information to a) know what the error is and b) know who to contact or what to do about the problem seems the best approach to me.

Fabricio Rocha - 12-Feb-2010 - One more reason for having a way to intercept and explain this kind of error to common users is that no test suite or test routine seems able to find all the errors that users are able to find. Of course it is not nice to show weaknesses to a final user, but this is something practically unavoidable in software. And in addition to the situations listed by LV, we can consider that, for open source/free software, providing good information about an error is a way to c) allow a user with sufficient programming knowledge to fix the problem and possibly contribute to the software development.

See also