Updated 2016-11-15 08:15:54 by APN

Execute in Parallel and Wait

fileevent is often recommended for execution of multiple processes in parallel, but it requires a script to be written in an event-driven style. Sometimes what is wanted is to execute multiple processes which run in parallel, and wait for them all to complete. Here is an example of that:
#! /bin/env tclsh

proc main {} {
    for {set i 0} {$i < 5} {incr i} {
        set script {
            #simulate a hard-working process
            for {set i 0} {$i < 10000000} {incr i} {
            }
            puts [list hello from [pid]]
        }
        set chan [open |[list [info nameofexecutable] <<$script 2>@stderr]] 
        dict set res $chan command $script
        fconfigure $chan -blocking 0
        lappend background $chan
    }
    while 1 {
        foreach chan $background {
            if {[eof $chan]} {
                fconfigure $chan -blocking 1
                if {[set idx [lsearch -exact $background $chan]] >= 0} {
                    set background [lreplace $background $idx $idx]
                }
                catch [close $chan] cres copts
                dict set res $chan result $cres
                dict set res $chan options $copts
            } else {
                puts -nonewline [read $chan]
            }
        }
        if {[llength $background] == 0} {
            break
        }
        after 100
    }
    return $res
}

#puts [main]
main

In contrast with some other solutions, TclX's wait, this example does not require any extension, and should work on various platforms.

Windows users, who typically start a script by double-clicking on the icon, running it in wish, should note that spawning the scripts might be problematic, for reasons discussed under the heading "Windows Console" on the exec page. Two possible solutions are

  • Run the program in tclsh rather than a GUI-based shell
  • Remove the 2>@stderr argument from the list creating the argument to the open command

[ethouris] - 2014-12-20 14:24:54

Well, this is such a great idea, that it can be even made a framework!

This is a set of two functions: "ppspawn" will execute multiple commands and will return a dictionary where keys are results of open command (plus special key "tracer" which is the name of the global tracer variable). Then you can trace each command individually, if you want, or use the second function, "pptrace", which will trace all commands one after another until they all finish. From every command spawned there will be read as many lines as currently available. The "pptrace" requires the same dictionary as created by "ppspawn" (although you can tweak it before passing to "pptrace", you just need to take care that keys are only descriptors of command spawns plus "tracer" which defines the name of the tracer variable).

The "pptrace" prints all lines from a process at the time of their availability, prepending them by [DESCRIPTOR]. After a single review of all processes it stops at the variable with the name defined under "tracer" key so that it doesn't read uselessly. This is unblocked when the variable is modified, and the variable is modified whenever any of the processes reports "readable" fileevent. Like in the parent example, finished processes are removed from the list, and after all processes are finished the command "pptrace" ends.

The "pptrace" gets optionally a lambda that is run at the "vblank" time (all processes were reviewed and the tracer is going to stop on the tracer variable), here it's blocked by a comment so that you can see "time wasted" on unnecessary reads.

This code has been created during the development of Tclmake tool, which is a part of Silvercat Tcl-based build system. Actually this is needed to provide the "parallel build" feature (although this below code is not to be the part of it - it's only the first "proof of concept").
#!/usr/bin/tclsh

proc ppspawn {args} {
    set res ""

    # This is lamah, but I really don't have much choice :)
    set tracername "::S"

    foreach cmd $args {
        set cc [open "|$cmd 2>@stderr"] 
        dict set res $cc command $cmd
        fconfigure $cc -blocking 0 -buffering line
        # cut off the "file" prefix, should it be any other word
        set id [lindex [regexp -inline {[a-zA-Z]*([0-9]+)} $cc] 1]
        if { $id == "" } {
            set id $cc
        }
        append tracername ".$id"
    }

    set all [dict keys $res]
    foreach cc $all {
        fileevent $cc readable "set $tracername $cc"
    }
    dict set res tracer $tracername

    return $res
}

proc pptrace {res {vblank {}}} {

    set tracername [dict get $res tracer]
    set res [dict remove $res tracer]
    set background [dict keys $res]

    while 1 {
        foreach cc $background {
            if {[eof $cc]} {
                if {[set idx [lsearch -exact $background $cc]] >= 0} {
                    set background [lreplace $background $idx $idx]
                }
                catch [close $cc] cres copts
                dict set res $cc result $cres
                dict set res $cc options $copts
            } else {
                # Roll until EOF or EAGAIN.
                # When EAGAIN, it will be retried at the next roll.
                # When EOF, it will be removed from the list at the next roll.
                while { [gets $cc linein] != -1 } {
                    puts "\[$cc\] $linein"
                }
            }
        }

        if {[llength $background] == 0} {
            # All processes finished.
            break
        }
        if { $vblank != "" } {
            apply $vblank $background
        }

        # Ok, now clear the variable
        set $tracername ""
        vwait $tracername
        #puts "UNBLOCKED BY: [set $tracername]"
   }
   return $res
}

set spn [ppspawn {*}$argv]
puts "Started processes:"
dict for {cc data} $spn {
    if { $cc != "tracer" } {
        puts "\[$cc\] [dict get $data command]"
    }
}
set bkp [dict keys $spn]
puts "TRACING: $bkp"
set spn [pptrace $spn]  ;# {bk {puts "--- still running: $bk"}}]
puts "RESULTS:"
dict for {cc data} $spn {
    puts "\[$cc\] [dict get $data command]"
    puts "RESULT: [dict get $data result]  OPTIONS: [dict get $data options]"
}

APN A generalized form of this is the Promise abstraction in other languages like Javascript, C#, Scala etc. A Tcl implementation is available in the promise package.