Updated 2008-07-18 09:43:07 by lars_h

Richard Suchenwirth 2001-05-31 - Robin Lauren <robin@lauren.net> wrote in the comp.lang.tcl newsgroup:

I want to split an argument which contains spaces within quotes into proper name=value pairs. But I can't :)

Consider this example:
  set tag {body type="text/plain" title="This is my body"}
  set element [lindex $tag 0]
  set attributes [lrange $tag 1 end] ;# *BZZT!* Wrong answer!

My attributes become the list {type="text/plain"} {title="This} {is} {my} {body"} (perhaps even with the optional backslash before the quotes), which isn't really what I had in mind.

How about this? It first splits on double quotes, which gives an alternating list of unquoted and quoted pieces; then it collects the words from each unquoted piece, and if a word ends in "=", appends (and re-quotes) the quoted piece that follows:
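 set result {}
 foreach {out in} [split $tag \"] {
     if {$out eq ""} break   ;# nothing left outside the quotes
     foreach i $out {
         # a word ending in "=" takes the following quoted piece as its value
         if {[regexp {=$} $i]} {set i $i\"$in\"}
         lappend result $i
     }
 }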

 % set result
 body type=\"text/plain\" {title="This is my body"}
 set attributes [lrange $result 1 end]

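Don't worry about the backslashed quotes in the type attribute. They are not really there: they appear only in the string representation of the list, and lindex $result 1 returns type="text/plain" without them. To get at the attribute names and values, split each element again on = (at the first = only, if a value may itself contain one).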
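
A minimal sketch of that last step, assuming Tcl 8.5 or later (for dict and lassign). The name splitAttribute is a hypothetical helper, not something from the original thread. It cuts at the first = only, so a value that itself contains = survives intact:

```tcl
# Hypothetical helper (not from the original discussion): split one
# name="value" item at its first "=" and strip the surrounding quotes.
proc splitAttribute {item} {
    set pos [string first = $item]
    if {$pos < 0} {
        # No "=" present: treat the whole item as a name with an empty value.
        return [list $item ""]
    }
    set name  [string range $item 0 [expr {$pos - 1}]]
    set value [string range $item [expr {$pos + 1}] end]
    # string trim removes double quotes at the two ends only.
    return [list $name [string trim $value \"]]
}

# Sample attribute list in the shape produced above, plus one value containing "=".
set attributes [list {type="text/plain"} {title="This is my body"} {href="a.cgi?x=1"}]

set parsed [dict create]
foreach item $attributes {
    lassign [splitAttribute $item] name value
    dict set parsed $name $value
}

puts [dict get $parsed title]   ;# This is my body
puts [dict get $parsed href]    ;# a.cgi?x=1
```

A plain split $item = would have broken the href value into three pieces, which is why the sketch uses string first instead.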

MG offers a regexp alternative:
  % set string {type="text/plain" title="This is my body"}
  % set result [regexp -all -inline {[^ =]+=(?:\S+|"[^"]+")} $string]
  type=\"text/plain\" {title="This is my body"}

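That will also match single-word values which aren't in quotes (short=foo long="foo bar"). As written it skips attributes with an empty value such as short= (long="" is only picked up because \S+ also matches the two quote characters); replacing the +'s in the value part with *'s makes empty values acceptable.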
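
A small sketch of that variant, with only the two +'s after the = changed to *'s (the name part keeps its +, so a bare = is not taken for an attribute). The input string is made up for illustration. It relies on Tcl's regexp preferring the longest match here, which is what lets the quoted branch win over \S* for long="foo bar":

```tcl
# MG's pattern with the value-side quantifiers relaxed from + to *.
set pattern {[^ =]+=(?:\S*|"[^"]*")}

# Illustrative input: empty unquoted, plain word, quoted with space, empty quoted.
set input {short= word=foo long="foo bar" empty=""}

set result [regexp -all -inline $pattern $input]

# Print one match per line, bracketed so empty values stay visible.
foreach item $result {
    puts "<$item>"
}
```

This should print four items: short=, word=foo, long="foo bar" and empty="", each in angle brackets.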

AM Also see: Splitting a string on arbitrary substrings

split - Arts and crafts of Tcl-Tk programming