proc file:uread {fn} {
    set encoding ""
    set f [open $fn r]
    if {[info tclversion]>=8.1} {
        gets $f line
        if {[regexp \xFE\xFF $line]||[regexp \xFF\xFE $line]} {
            fconfigure $f -encoding unicode
            set encoding unicode
        }
        seek $f 0 start ;# rewind -- real reading is still to come
    }
    set text [read $f [file size $fn]]
    close $f
    if {$encoding=="unicode"} {
        regsub -all "\uFEFF|\uFFFE" $text "" text
    }
    return $text
}

Works on both ASCII and Unicode files, though not on byte-swapped ones: the FFFE mark seems to be handled in the code, but the swapping itself is not yet ;-(

See also: Unicode and UTF-8

Frank Pilhofer contributed the following swapper, which operates on string data that might be a whole Unicode file, in the comp.lang.tcl newsgroup:

Fortunately, swapping is pretty easy in Tcl, at least in LOC:
private method wordswap {data} {
    binary scan $data s* elements
    return [binary format S* $elements]
}

jima: I think it is better to use:

binary scan $data c* elements

Can any expert check my (jima's) point?

So I'm now using the following code for reading:
global tcl_platform
# Read the first two bytes as a big-endian 16-bit integer to look for a byte order mark.
if {[binary scan $data S bom] == 1} {
    if {$bom == -257} {
        # 0xFEFF: the data is big-endian
        if {$tcl_platform(byteOrder) == "littleEndian"} {
            set data [wordswap [string range $data 2 end]]
        } else {
            set data [string range $data 2 end]
        }
    } elseif {$bom == -2} {
        # 0xFFFE: the data is little-endian
        if {$tcl_platform(byteOrder) == "littleEndian"} {
            set data [string range $data 2 end]
        } else {
            set data [wordswap [string range $data 2 end]]
        }
    } elseif {$tcl_platform(byteOrder) == "littleEndian"} {
        # no byte order mark: the data is assumed to be big-endian
        set data [wordswap $data]
    }
}

Slightly off-topic note: The code example above tests for the Tcl version with
if {[info tclversion] >= 8.1} ...

A better way of testing that is to use:

if {[package vcompare [package provide Tcl] 8.1] >= 0} ...

That will continue working if Tcl releases are ever labeled with version numbers more than two levels deep, or if/when a minor release > 9 is released (a numeric comparison would read 8.10 as 8.1).

Donald Porter

Sure. I admit yours is The Right Way ;-) -- only it's about twice as long as mine... Maybe I'm pampered, but I've grown to expect it could be done even more simply, so that frequent constructs are nicely wrapped:
proc version {"of" pkg op vers} {
    expr [package vcompare [package provide $pkg] $vers] $op 0
}

Then we can write this sugar (cf. Salt and Sugar):

if [version of Tcl >= 8.1] {...

RS

See also Unicode and UTF-8
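
The version-comparison point discussed above can be tried out with a short sketch. It contrasts a plain numeric comparison with `package vcompare` on a made-up two-digit minor version. The helper name `tclAtLeast` is an assumption introduced only for illustration and is not part of Tcl or of the code above.

```tcl
# Hypothetical helper (the name is an assumption, not from the page above):
# returns 1 if the running Tcl is at least version $vers, else 0.
proc tclAtLeast {vers} {
    expr {[package vcompare [package provide Tcl] $vers] >= 0}
}

# A numeric comparison reads 8.10 as the floating-point number 8.1,
# so a (made-up) release 8.10 would sort before 8.9.
puts [expr {8.10 >= 8.9}]            ;# 0

# package vcompare compares the components one by one: 10 > 9.
puts [package vcompare 8.10 8.9]     ;# 1

# Version numbers more than two levels deep are handled as well.
puts [package vcompare 8.6.13 8.1]   ;# 1

# Result depends on the interpreter running the script.
puts [tclAtLeast 8.1]
```

The braced `expr` inside the helper is possible here because the operator is fixed; the `version` proc above has to leave `expr` unbraced so that `$op` is substituted as an operator.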
