Version 20 of fileutil

Updated 2006-07-24 13:00:26

Documentation can be found at http://tcllib.sourceforge.net/doc/fileutil.html


Currently the fileutil package contains a find, grep, and cat. Other procs that would be useful to add would include wc, tee, head, tail, and perhaps some awk'ish type functions ala Tclx.

For more on grep, see "NR-grep: A Fast and Flexible Pattern Matching Tool" [L1 ].


Perhaps even some code like Glenn Jackman's:

    proc touch {filename {time ""}} {
        if {[string length $time] == 0} {set time [clock seconds]}
        file mtime $filename $time
        file atime $filename $time
    }

glennj: This proc has been accepted into tcllib 1.2: http://tcllib.sourceforge.net/doc/fileutil.html

US Unix-like touch:

    proc touch {filename {time ""}} {
        if {![file exists $filename]} {
           close [open $filename a]
           }
        if {[string length $time] == 0} {set time [clock seconds]}
        file mtime $filename $time
        file atime $filename $time
    }

What other file-related procs would be useful?


2003-11-28 VI Nice of you to ask. There's a list above, other than that: tail -f, split, join. I use tkcon as my main shell on a wimpy laptop. Fewer dlls loaded is good..


2003-12-16 SS Trying to improve over the Tcl implementation of wc in the Great Language Shootout I wrote this, that seems half in execution time against big files:

 set text [read stdin]
 set c [string length $text]
 set l [expr {[llength [split $text "\n\r"]]-1}]
 set T [split $text "\n\r\t "]
 set w [expr {[llength $T]-[llength [lsearch -all -exact $T {}]]-1}]
 puts "\t$l\t$w\t$c"

Output seems to be identical to GNU's wc command.


SEH 20060723 -- The proc fileutil::find is useful, but it has several deficiencies:

  • On Windows, hidden files are mishandled.
  • On Windows, checks to avoid infinite loops due to nested symbolic links are not done.
  • On Unix, nested loop checking requires a "file stat" of each file/dir encountered, a significant performance hit.
  • The proc calls itself recursively, and thus risks running into interp recursion limits for very large systems.
  • fileutil.tcl contains three separate instantiations of proc find for varying os's/versions. Maintenance nightmare.

The following code eliminates all the above deficiencies. It checks for nested symbolic links in a platform-independent way, and scans directory hierarchies without recursion.

For speed and simplicity, it takes advantage of glob's ability to use multiple patterns to scan deeply into a directory structure in a single command, hence the name. Its calling syntax is the same as fileutil::find, so with a name change it could be used as a drop-in replacement:

 proc globfind {{basedir .} {filtercmd {}}} {
    set depth 16
    set filt [string length $filtercmd]
    set basedir [file normalize $basedir]
    file stat $basedir fs
    set linkName $basedir
    while {$fs(type) == "link"} {
        if [catch {
            file stat [set linkName [file normalize [file link $linkName]]] fs
        }] {break}
    }
    if {$fs(type) == "file"} {
        set filename $basedir
        if {!$filt || [uplevel $filtercmd [list $filename]]} {
            return $filename
        }
    }
    set globPatternTotal {}
    set globPattern *
    set incrPattern /*
    for {set i 0} {$i < $depth} {incr i} {
        lappend globPatternTotal $globPattern
        append globPattern $incrPattern
    }

    lappend checkDirs $basedir
    set returnFiles {}
    set redo 0
    set terminate 0
    set hidden {}
    while {!$terminate} {
        set currentDir [lindex $checkDirs 0]
        if !$redo {
            set allFiles [eval glob\
                              -directory [list $currentDir]\
                              -nocomplain $hidden $globPatternTotal]
        }
        set redo 0
        set termFile [lindex $allFiles end]
        set termFile [lrange [file split $termFile]\
                          [llength [file split $currentDir]] end]
        if {$hidden != {}} {
            set checkDirs [lrange $checkDirs 1 end]
        }
        foreach test {checkdirs length duplicate recursion prune} {
            switch $test {
                checkdirs {
                    set afIndex [llength $allFiles]
                    incr afIndex -1
                    for {set i $afIndex} {$i >= 0} {incr i -1} {
                        set cdir [lindex $allFiles $i]
                        if {[llength\
                                 [lrange [file split $cdir]\
                                      [llength [file split $currentDir]]\
                                      end]]
                            < $depth} {
                            break
                        }
                        file stat $cdir fs
                        set linkName $cdir
                        while {$fs(type) == "link"} {
                            if [catch {
                                file stat\
                                    [set linkName\
                                         [file normalize\
                                              [file link $linkName]]] fs
                            }] {
                                break
                            }
                        }
                        if {$fs(type) == "directory"} {
                            lappend checkDirs $cdir
                        }
                    }
                }                                        
                length {
                    if {[llength $termFile] < $depth} {
                        break
                    }
                }
                duplicate {
                    set recurseTest 0
                    set dupFile [lindex $allFiles end]
                    set dupFile [lrange [file split $dupFile]\
                                     [llength [file split $basedir]]\
                                     end]
                    set dupFileEndDir [expr [llength $dupFile] - 2]
                    if {[lsearch $dupFile [lindex $dupFile end-1]]
                        < $dupFileEndDir} {
                        set recurseTest 1
                    }
                }
                recursion {
                    if !$recurseTest {continue}
                    if {($hidden == {})} {
                        set type "-types l"
                    } else {
                        set type "-types [list "hidden l"]"
                    }

                    set linkFiles {}
                    set linkDir $currentDir
                    while 1 {
                        set linkFiles\
                            [concat $linkFiles\
                                 [eval glob\
                                      -directory [list $linkDir]\
                                      -nocomplain $type $globPatternTotal]]
                        if {$linkDir == $basedir} {break}
                        set linkDir [file dirname $linkDir]
                    }
                    array unset links
                    set linkFiles [lsort -unique $linkFiles]
                    foreach lf $linkFiles {
                        set ltarget [file normalize [file readlink $lf]]
                        if {[array names links -exact $ltarget] != {}} {
                            lappend pruneLinks $lf
                            set redo 1
                        }
                        array set links "$ltarget $lf"
                    }
                }
                prune {
                    if ![info exists pruneLinks] {continue}
                    set afIndex [llength $allFiles]
                    incr afIndex -1
                    set cdIndex [llength $checkDirs]
                    incr cdIndex -1
                    set rfIndex [llength $returnFiles]
                    incr rfIndex -1
                    foreach pl $pruneLinks {
                        for {set i $afIndex} {$i >= 0} {incr i -1} {
                            set af [lindex $allFiles $i]
                            if ![string first $pl/ $af] {
                                set allFiles [lreplace $allFiles $i $i]
                            }
                        }
                        for {set i $cdIndex} {$i >= 0} {incr i -1} {
                            set cd [lindex $checkDirs $i]
                            if ![string first $pl/ $cd] {
                                set checkDirs [lreplace $checkDirs $i $i]
                            }
                        }
                        for {set i $rfIndex} {$i >= 0} {incr i -1} {
                            set rf [lindex $returnFiles $i]
                            if ![string first $pl/ $rf] {
                                set returnFiles [lreplace $returnFiles $i $i]
                            }
                        }
                    }
                    unset pruneLinks
                }
                default {}
            }
        }
        if $redo continue
        if {$hidden == {}} {
            set hidden "-types hidden"
        } else {
            set hidden {}
            if {[llength $checkDirs] == 0} {set terminate 1}
        }
        set returnFiles [concat $returnFiles $allFiles]
    }
    set filterFiles {}
    foreach filename [lsort -unique $returnFiles] {
        if {!$filt || [uplevel $filtercmd [list $filename]]} {
            lappend filterFiles $filename
        }
    }
    return $filterFiles
 }

Category Package, subset Tcllib, Category File