Mbox to MH directory conversion tool

This is a very simple tool that stores each message of an mbox file in the specified MH directory. It is mostly a sample use of the rfc822::headers package, which can be found in Reading and parsing RFC 822 headers.

Usage
mb2mh directory file

On startup, the program searches the specified directory for entries (regular files, directories, or anything else) whose names are purely numeric, and takes the highest such number. The messages contained in the mbox file are then assigned numbers starting from that number plus one. If no such entry exists, numbering starts at 1.
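The numbering rule can be sketched in isolation as follows. This is only an illustration: the proc name next-mh-number is hypothetical and is not part of mb2mh.tcl, and unlike the script it deliberately skips names with a leading zero so that they are never misread as octal by older Tcl versions.

```tcl
## Hypothetical helper (not part of mb2mh.tcl): return the number that
## the next stored message would get in the MH directory $directory.
proc next-mh-number { directory } {
    set last 0
    foreach path [ glob -nocomplain -directory $directory -- * ] {
        set tail [ file tail $path ]
        ## accept purely decimal names without a leading zero
        if { [ regexp -- {^[1-9][0-9]*$} $tail ] && $tail > $last } {
            set last $tail
        }
    }
    return [ expr { $last + 1 } ]
}
```

For a directory holding the entries 3, 12 and notes, the sketch should return 13; for an empty directory it should return 1.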

The program supports two conventions for detecting the end of a message. First, if the header of a message contains a Lines field, the body is presumed to be exactly that many lines long. If the field is missing, a line matching the pattern "From *" after an empty line is considered the start of a new message. In this case, the empty line is stored in the output file, too.
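The second convention can be illustrated with a small sketch that works on a list of lines instead of files. The proc name split-mbox-lines is hypothetical and is not part of mb2mh.tcl; the sketch ignores the Lines field and the header state entirely, so it is a simplification of what the script does, not a replacement for it.

```tcl
## Hypothetical helper (not part of mb2mh.tcl): split a list of mbox
## lines into messages using only the "From " convention.  Each
## message is returned as a list of lines, "From " line included.
proc split-mbox-lines { lines } {
    set messages [ list ]
    set current  [ list ]
    set started? 0
    ## the start of the input counts as "after an empty line"
    set after-empty? 1
    foreach line $lines {
        if { ${after-empty?} && [ string match "From *" $line ] } {
            if { ${started?} } { lappend messages $current }
            set current  [ list ]
            set started? 1
        }
        ## lines before the first "From " line are dropped
        if { ${started?} } { lappend current $line }
        set after-empty? [ string equal "" $line ]
    }
    if { ${started?} } { lappend messages $current }
    return $messages
}
```

Note that a body line beginning with "From " is left alone unless it directly follows an empty line, and that the empty line preceding a new message stays at the end of the previous one.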

Feel free to improve the code to suit your own needs. I think a reimplementation of the mb2md [L1 ] tool in pure Tcl could be interesting.

SEH 20060115 -- Yes, this code means that you could package up tkbiff and exmh into a complete pure-Tcl mail solution.

 ### mb2mh.tcl --- Sample implementation of MH inc(1)-like tool  -*- Tcl -*-
 ## $Id: 15276,v 1.3 2006-01-16 07:00:41 jcw Exp $
 ## The following code is in public domain.

 ## NB: because this file is just an example, it lacks a number of
 ##     consistency checks.  USE WITH CARE!

 ## the next line restarts using tclsh \
 exec tclsh8.4 "$0" "$@"

 ### Code:

 package require rfc822::headers
 namespace import ::rfc-822::header-parse-line

 ## handle command line arguments
 if { [ llength $argv ] != 2 } {
     puts stderr "Usage: mb2mh MH-DIRECTORY MBOX-FILE"
     exit 1
 }
 foreach { directory mbox } \
     $argv \
     break

 ## open mbox
 set input [ open $mbox r ]

 ## obtain last message number in the MH directory
 set last 0
 foreach file [ glob -directory $directory -nocomplain "\[0-9\]*" ] {
     set tail [ file tail $file ]
     if { ! [ regexp -- "\[^0-9\]" $tail ]
          && $tail > $last } {
         set last $tail
     }
 }

 ## read the messages, storing them in the MH directory
 set may-be-new? 1
 set must-be-new? 1
 set file ""
 while { ! [ eof $input ] } {
     if { [ gets $input line ] < 0 } { break }
     set got-from?  [ string match "From *" $line ]
     set got-empty? [ string equal "" $line ]
     if { ${must-be-new?} && ${got-empty?} } {
         puts stderr "ignoring an empty line before the message"
         continue
     } elseif { ${must-be-new?}
                || ${may-be-new?} && ${got-from?} } {
         ## New message
         set may-be-new? 0
         set must-be-new? 0
         set in-header? 1
         set lines -1

         if { [ string length $file ] } {
             puts stdout [ format "%d: done" $last ]
             close $output
         }
         set file [ file join $directory [ incr last ] ]
         set output [ open $file { WRONLY CREAT EXCL } ]

         array unset header
         if { ! ${got-from?} } {
             header-parse-line header $line
         }
     } elseif { ${in-header?} 
                && [ header-parse-line header $line ] } {
         set in-header? 0
         set lines [ expr { [ info exists header(Lines) ]
                            ? [ string trim \
                                    [ lindex $header(Lines) 0 ] ]
                            : -1 } ]
     } elseif { $lines >= 0
                && [ incr lines -1 ] < 0 } {
         set must-be-new? 1
         ## NB: drop the empty separator line after the body;
         ##     a non-empty line is still written below
         if { ${got-empty?} } {
             continue
         }
     } elseif { ${got-empty?} } {
         set may-be-new?  1
     } else {
         ## "From " starts a message only right after an empty line
         set may-be-new?  0
     }
     puts $output $line
 }
 if { [ string length $file ] } {
     puts stdout [ format "%d: done" $last ]
     close $output
 }

 ### Emacs stuff
 ## Local variables:
 ## fill-column: 72
 ## indent-tabs-mode: nil
 ## ispell-local-dictionary: "english"
 ## End:
 ### mb2mh.tcl ends here