How can I run data through an external filter?

JBR : I'd like a simple, reliable way to push a possibly large amount of data to an external filter (pipeline) and read it back. The obvious code below doesn't work: gets blocks without re-entering the event loop, so the writable handler never gets a chance to push data into the pipeline.

 set fp [open "| cat" RDWR]

 proc write {} {
    puts $::fp "Hi"
 }

 fileevent $fp writable { write }

 while { [gets $fp line] >= 0 } {
    puts $line
 }

Is there a simple incantation to wrap the gets with that allows non-blocking IO and re-entry into the event loop, and reliably returns the filtered data?


JBR 2009-11-17 : Here is a syntactically palatable solution if you'll give me 8.6. NEM's suggestion below put me on the track that both the reader and writer should be in fileevents. It would be stronger if "reader", "writer" and "pipe-done" were anonymous and if puts and gets could be wrapped unobtrusively. But I like this :

AMG: apply lets you use anonymous lambdas instead of named procs.
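
A minimal sketch of what that could look like, assuming Tcl 8.5 or later. The variable names, the ::result global and the use of cat are only illustrative and are not part of the code on this page:

```tcl
# Sketch: a readable handler written as an anonymous lambda via apply.
set fp [open "| cat" r+]
fconfigure $fp -blocking 0 -buffering line

fileevent $fp readable [list apply {{chan} {
    if {[gets $chan line] >= 0} {
        set ::result $line          ;# one complete line arrived
    } elseif {[eof $chan]} {
        set ::result ""             ;# pipeline ended without producing a line
    }
}} $fp]

puts $fp "Hi"                       ;# line buffering flushes at the newline
vwait ::result                      ;# run the event loop until the handler fires
close $fp
```

No named proc is left behind, although the result still has to travel through a global so that vwait can see it.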

JBR 2009-11-17 : I've come back and modified the original code to make use of the unique $pipe handle name in the proc and global variable names. This removes the most blatant name collisions, leaving only the IO wrapping.

 proc pipe-puts { args } {
    puts {*}$args
    yield
 }

 proc pipe-gets { args } {
    if { [llength $args] == 2 } {
        upvar [lindex $args 1] [lindex $args 1]
    }
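    # Keep yielding until a complete line is available or EOF is reached;
    # a single yield would hand a spurious -1 back to the caller.
    while { [set n [gets {*}$args]] == -1 && ![eof [lindex $args 0]] } {
        yield
    }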

    set n
 }

 proc pipe-filter { writer command reader } {
    set pipe [open |$command RDWR]

    fconfigure $pipe -blocking 0

    set writer "yield\n$writer\nclose $pipe write"
    set reader "yield\n$reader\nclose $pipe read\nset ::pipe-done-$pipe 1"

    coroutine writer-$pipe apply [list pipe $writer] $pipe
    coroutine reader-$pipe apply [list pipe $reader] $pipe

    fileevent $pipe writable writer-$pipe
    fileevent $pipe readable reader-$pipe

    vwait pipe-done-$pipe
 }

 pipe-filter {
    foreach i { 1 2 3 4 5 } {
        pipe-puts $pipe $i
    }
 } {
    cat
 } {
    while { [set n [pipe-gets $pipe line]] >= 0 } {
        puts "$n $line"
    }
 }

This code is nice, but the input and output data must live in a globally accessible namespace: because the reader and writer run as coroutines, the data cannot be proc local. In addition, the IO cannot take advantage of library code that uses puts/gets to write/read data, because the IO is non-blocking and the routines must yield to cooperate.


JBR 2009-11-20 : Here are some redefinitions of puts/gets that allow them to work in the coroutine context needed above. fconfigure -blocking is used to decide whether we need to yield. This may interfere with other uses of non-blocking IO. I think this makes using UNIX filters inline with Tcl very nice indeed:

 rename puts _puts
 proc puts { args } {
    _puts {*}$args

    switch [llength $args] {
        2 { if { [lindex $args 0] ne "-nonewline"                       \
              && [fconfigure [lindex $args 0] -blocking] == 0 } { yield } }
        3 { if { [fconfigure [lindex $args 1] -blocking] == 0 } { yield } }
    }
 }

 rename gets _gets
 proc gets { args } {
    if { [llength $args] == 2 } {
        upvar [lindex $args 1] [lindex $args 1]
    }

    while { [set n [_gets {*}$args]] < 0 } {
        if { [fconfigure [lindex $args 0] -blocking] == 1               \
          || [eof [lindex $args 0]] } {
            break
        }

        yield
    }

    set n
 }
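
One possible way to reduce the interference mentioned above is to yield only when the code really is running inside a coroutine. The following is a sketch under that assumption, not a tested replacement for the wrappers; maybe-yield is a hypothetical helper name, and info coroutine requires Tcl 8.6:

```tcl
# Hypothetical helper: yield only for non-blocking channels inside a coroutine.
proc maybe-yield {chan} {
    if {[info coroutine] ne "" && ![fconfigure $chan -blocking]} {
        yield
    }
}

set chan [open "| cat" r+]
fconfigure $chan -blocking 0

maybe-yield $chan                      ;# outside a coroutine: returns at once

coroutine demo apply {{chan} {
    maybe-yield $chan                  ;# inside a coroutine: suspends here
    return resumed
}} $chan
set status [demo]                      ;# resume; the coroutine runs to completion
close $chan
```

The yield calls in the redefined puts and gets could be swapped for a helper like this, so that ordinary event-driven code using non-blocking channels outside a coroutine does not fail on a stray yield.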

NEM: Why not use fileevent (chan event in 8.5+) for this side too?

 proc read_data fp {
     if {[gets $fp line] >= 0} {
         puts $line
     } else {
         if {[eof $fp]} { close $fp }
     }
 }
 chan configure $fp -blocking 0
 chan event $fp readable [list read_data $fp]
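
Putting both directions into fileevents without coroutines might look like the sketch below. It assumes Tcl 8.6 for the half-close; write_data and the globals pending, filtered and done are hypothetical names chosen for illustration:

```tcl
# Sketch: push a list of lines through a filter using only fileevents.
proc write_data {fp} {
    if {[llength $::pending] == 0} {
        fileevent $fp writable {}    ;# nothing left to send
        close $fp write              ;# half-close so the filter sees EOF
        return
    }
    # Take the first pending line; lassign returns the remaining ones.
    set ::pending [lassign $::pending line]
    puts $fp $line
}

proc read_data {fp} {
    if {[gets $fp line] >= 0} {
        lappend ::filtered $line     ;# collect one complete line
    } elseif {[eof $fp]} {
        close $fp
        set ::done 1                 ;# wake up the vwait below
    }
}

set pending  {1 2 3 4 5}
set filtered {}
set fp [open "| cat" r+]
fconfigure $fp -blocking 0
fileevent $fp writable [list write_data $fp]
fileevent $fp readable [list read_data $fp]
vwait ::done
puts $filtered
```

The price is that the writer has to be turned inside out into a callback that sends one item per event, which is the awkwardness the coroutine version avoids.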

Another idea that I've used in Python is to write the data from a dedicated "writer" thread and read the result back in the mainline thread. This is harder in Tcl because the two threads run in different interpreters and therefore don't share the data. Is there a simple (non-copying) way for interps to share data?


AMG points out on the filter page that this is a real shortcoming - can this be remedied with a simple scripted solution?

AMG: Probably this is in reference to my comment on lack of ability to close only one direction of a read/write channel. TIP 332 has corrected this deficiency.
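
With the half-close from TIP 332 (Tcl 8.6), the simplest case can be sketched as below. The tr command is just an example filter. This blocking version is only suitable for small inputs: with large data, puts can stall once the pipe buffers fill in both directions, which is the problem the event-driven code on this page addresses.

```tcl
# Sketch: write everything, half-close, then read everything back.
set pipe [open "| tr a-z A-Z" r+]
puts $pipe "hello filter"
close $pipe write          ;# flush and signal EOF on the filter's stdin
set result [read $pipe]    ;# blocking read of everything the filter produced
close $pipe
puts -nonewline $result
```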


Thanks jbr.