Version 17 of TcLeo

Updated 2004-11-09 19:02:14 by lwv

by Reinhard Max

This little script sends its command line arguments as a query to the online dictionary at http://dict.leo.org and writes the parsed result to stdout. It uses Tcl's http package and the htmlparse and ncgi packages from Tcllib.

The scraper part (everything inside the ::dict.leo.org namespace) could also be used from other frontends. Its [query] proc takes a list of words to search for and returns the English/German pairs that matched the query, as a flat list of alternating English and German entries.


 package require http
 package require htmlparse
 package require ncgi
 namespace eval ::dict.leo.org {
    variable table ""
    proc parse {tag close options body} {
        variable TD
        variable table
        switch -- $close$tag {
            TD     {set TD ""}
            /TD    {if {[llength $TD]} {lappend table [string trim $TD]}}
            default {append TD [string map {&nbsp; { }} $body]}
        }
    }
    proc query {query} {
        variable table
        set url "http://dict.leo.org/?search=[::ncgi::encode $query]"
        set tok [::http::geturl $url]
        foreach line [split [::http::data $tok] "\n"] {
            if {[string match "*search results*" $line]} break
        }
        ::http::cleanup $tok
        set table ""
        ::htmlparse::parse -cmd ::dict.leo.org::parse $line
        return $table
    }
 }
 proc max {a b} {expr {$a > $b ? $a : $b}}
 proc main {argv} {
    set table [dict.leo.org::query [join $argv]]
    set max 0
    foreach c $table {set max [max $max [string length $c]]}
    set sep [string repeat = $max]
    set table [linsert $table 0 " English" " Deutsch" $sep $sep]
    foreach {c1 c2} $table {
        puts [format "%-*s  %-*s" $max $c1 $max $c2]
    }
    puts ""
 }
 main $argv
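
Since [query] returns a flat list of alternating English and German entries, another frontend might prefer to regroup it into real pairs first. The following is only a sketch: the proc name pairs and the sample data are made up for illustration and are not part of the original script.

```tcl
# Hypothetical helper, not part of the original script: turn the flat
# {english german english german ...} list into a list of two-element pairs.
proc pairs {flat} {
    set result {}
    foreach {english german} $flat {
        lappend result [list $english $german]
    }
    return $result
}

# Sample data standing in for a real result of ::dict.leo.org::query.
set table {house Haus tree Baum}
foreach pair [pairs $table] {
    puts "[lindex $pair 0] -> [lindex $pair 1]"
}
```

With a live connection, the sample list would be replaced by a call such as [::dict.leo.org::query {house}].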

RS: Proud owners of a firewall might have to add a line like

    http::config -proxyhost proxy -proxyport 80

at the very top of proc query. In my case this was what it took to actually get out.
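
Instead of hard-coding the proxy, one could try to pick it up from the environment. This is a rough sketch under assumptions: it expects an http_proxy environment variable of the simple form http://host:port (no user name or password), and the proc name proxyFromUrl is made up here. Only http::config with -proxyhost and -proxyport comes from the http package.

```tcl
package require http

# Hypothetical helper: split a proxy URL such as http://proxy.example.com:3128
# into a {host port} list. Returns an empty list if the value does not match
# the simple host:port form assumed here.
proc proxyFromUrl {url} {
    if {[regexp {^(?:[a-z]+://)?([^:/]+):(\d+)/?$} $url -> host port]} {
        return [list $host $port]
    }
    return {}
}

# Configure the http package only if a usable proxy setting was found.
if {[info exists ::env(http_proxy)]} {
    set hp [proxyFromUrl $::env(http_proxy)]
    if {[llength $hp] == 2} {
        ::http::config -proxyhost [lindex $hp 0] -proxyport [lindex $hp 1]
    }
}
```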


Synox: It doesn't work at the moment... I think this is because they updated the site? Is there an update available?
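
One weakness of the script as posted is that the foreach loop in proc query keeps the last line of the page when no line contains "search results", so a changed page layout produces an empty table rather than an error. The sketch below only shows how such a failure could be reported. The proc name findLine is made up, and it does not know what the current markup of the site looks like.

```tcl
# Hypothetical helper: return the first line of $html that contains $marker,
# or raise an error instead of silently continuing with the last line.
proc findLine {html marker} {
    foreach line [split $html "\n"] {
        if {[string first $marker $line] >= 0} {
            return $line
        }
    }
    error "marker \"$marker\" not found; the page layout may have changed"
}

# Stand-in page text; a real frontend would pass [::http::data $tok] here.
set page "<html>\n<table>search results<td>Haus</td></table>\n</html>"
if {[catch {findLine $page "search results"} line]} {
    puts stderr "TcLeo: $line"
} else {
    puts $line
}
```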


Web scraping | Using Tcl to write WWW client side applications