dos2unix

GPS @ April 25 2003 - A dos2unix script is usually used to convert the end of line characters in text files from DOS/Windows format to a Unix-like format. Tcl can do automatic conversion of line endings, but I also need the conversion done in Windows, because I use xedit in Cygwin XFree86, so I came up with this short script.

 #!/bin/tclsh8.4

 foreach f $::argv {
        puts ${f}...
        set fd [open $f r]
        fconfigure $fd -translation binary
        set data [read $fd]
        close $fd

        set data [string map {"\r\n" "\n"} $data]

        set fd [open $f w]
        fconfigure $fd -translation binary
        puts -nonewline $fd $data
        close $fd
 }

There are also ways of doing this with the tr utility, sed, and probably others...

KPV: Below is a generalization of the above script that automatically converts from Unix, Macintosh or Windows line-ending format into the native one.

 #!/bin/tclsh8.4

 if {$tcl_platform(platform) == "windows"} {
    set eol "\r\n"
 } elseif {$tcl_platform(platform) == "unix"} {
    set eol "\n"
 } else {
    set eol "\r"
 }
 set eolmap [list "\r\n" $eol "\n" $eol "\r" $eol]

 foreach f $::argv {
    puts ${f}...
    set fd [open $f r]
    fconfigure $fd -translation binary
    set data [read $fd]
    close $fd

    set data [string map $eolmap $data]

    set fd [open $f w]
    fconfigure $fd -translation binary
    puts -nonewline $fd $data
    close $fd
 }

How about adding something like "if {![string is ascii $data]} {continue}" in the foreach loop, right after the file has been read? Then you'd have something like Sun's stripcr program; you could turn it loose on a whole directory tree and it would skip over the binaries and only convert the text files. The foreach statement could be: foreach f [fileutil::find . {file isfile}] (assuming you have tcllib).
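
The skip-the-binaries idea above could be sketched roughly as follows. The proc name dos2unixIfText is made up for illustration, and the "all bytes are ASCII" test is only a heuristic: a UTF-8 or Latin-1 text file containing non-ASCII characters would be skipped as if it were binary.

 # Hypothetical helper: convert CRLF to LF only if the file looks like text.
 # Returns 1 if the file was rewritten, 0 if it was skipped.
 proc dos2unixIfText {f} {
     set fd [open $f r]
     fconfigure $fd -translation binary
     set data [read $fd]
     close $fd

     # Heuristic: any byte above 127 marks the file as binary.
     if {![string is ascii $data]} {
         return 0
     }

     set data [string map [list "\r\n" "\n"] $data]

     set fd [open $f w]
     fconfigure $fd -translation binary
     puts -nonewline $fd $data
     close $fd
     return 1
 }

To walk a directory tree, assuming tcllib's fileutil package is installed, something like this should work:

 package require fileutil
 foreach f [fileutil::find .] {
     if {[file isfile $f]} {
         dos2unixIfText $f
     }
 }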


male - 26.04.2003:

Sorry, but why use the escape codes for carriage return and line feed?

Why not use Tcl's own translation capability via fconfigure?

Why not simply:

 proc convertNewLine {platform args} {
   switch -exact -- $platform {
     apple   {set translation cr;}
     auto    {set translation auto;}
     windows {set translation crlf;}
     unix    {set translation lf;}
     default {error "bad platform \"$platform\": must be apple, auto, windows, or unix";}
   }

   foreach fileName $args {
     if {[catch {set fid [open $fileName r];} reason]} {
       error $reason;
     }
     set data [read $fid [file size $fileName]];
     close $fid;

     if {[catch {set fid [open $fileName w];} reason]} {
       error $reason;
     }
     fconfigure $fid -translation $translation;
     puts -nonewline $fid $data;
     close $fid;
   }

   return;
 }

2003-04-27 RS: Even better - since Tcl accepts all usual line terminations on input, and uses the native on output, the following should suffice:

 proc normalizeNewlines fn {
    set    fp [open $fn]
    set  data [read $fp]
    close $fp
    set    fp [open $fn w]
    puts -nonewline $fp $data  ;# plain puts would append an extra newline on every run
    close $fp
 }
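
The claim that Tcl accepts all the usual line terminations on input can be checked with a small experiment. This is only a sketch: eol-demo.txt is a made-up scratch file name, and the channel is left at its default -translation auto for reading.

 set fn eol-demo.txt   ;# hypothetical scratch file in the current directory

 # Write one line in each style, bypassing any translation.
 set fp [open $fn w]
 fconfigure $fp -translation binary
 puts -nonewline $fp "dos\r\nmac\runix\n"
 close $fp

 # Read it back with the default (auto) input translation.
 set fp [open $fn]
 set data [read $fp]
 close $fp
 file delete $fn

 # All three terminators should have become a plain newline.
 puts [string equal $data "dos\nmac\nunix\n"]

Only the output side depends on the platform, which is why the proc above ends up writing native line endings.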

Can't we just use fcopy?

 proc dos2unix {in out} {
     fcopy [set a [open $in]] [set b [open $out w]]
     close $a
     close $b
 }

RS: In principle yes, except that in all previous code the written file should have the same name as the read file. By using a different name for out, and finally renaming that (and deleting the old in), the same effect could be achieved.
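
A minimal sketch of that rename approach, with some assumptions: the proc name dos2unixInPlace and the ".tmp" suffix are invented here, no file of that temporary name is expected to exist already, and the output channel is set explicitly to lf so the result is Unix-style even when run on Windows.

 # Hypothetical helper: fcopy into a temporary file, then rename it over the original.
 proc dos2unixInPlace {fn} {
     set tmp $fn.tmp                  ;# assumed scratch name next to the original
     set in  [open $fn]               ;# default auto translation accepts CRLF, CR and LF
     set out [open $tmp w]
     fconfigure $out -translation lf  ;# always write Unix line ends
     fcopy $in $out
     close $in
     close $out
     file rename -force -- $tmp $fn   ;# replaces the old file
 }

Note that with the input side left on auto, lone carriage returns (old Macintosh style) get converted as well, not just CRLF pairs.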