EBCDIC

Extended Binary Coded Decimal Interchange Code: IBM's family of character encodings. Unless you are dealing with IBM systems or legacy data, it is almost certainly better to use one of the encodings that evolved from ASCII.

Interestingly enough, EBCDIC was developed after ASCII; IBM participated in the standardization of ASCII but never implemented that code widely. A large part of the reason is that EBCDIC was designed for easy encoding on Hollerith cards, whereas using ASCII required either a non-standard card code (shown in http://www.cwi.nl/~dik/english/codes/80col.html#decascii ) or a fair amount of hardware (a large hand-soldered diode matrix) to do the code conversion.

The alphabet is not contiguous in EBCDIC because the code is tied to the card code. The most significant nybble of an EBCDIC character represents the zone punches (12, 11, 0) and the least significant nybble represents the numeric punches (1-9). There is an odd special case surrounding zero, blank, and null, and there are several other curious cases that stem from the vagaries of the 029 key punch. http://www.cwi.nl/~dik/english/codes/stand.html#ebcdic has more.
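
The nybble split can be seen from Tcl itself. The following is only a sketch: it assumes the ebcdic encoding that ships with Tcl, and ebcdicNybbles is a helper name made up for this page. It prints the high (zone) and low (numeric) nybble for the first and last character of each run of letters and digits.

```tcl
# Split the EBCDIC code of a single character into its two nybbles.
# Returns a two-element list: {zone digit}.
# (ebcdicNybbles is a hypothetical helper, not part of Tcl.)
proc ebcdicNybbles {char} {
    # Encode the character as one EBCDIC byte and read it as an integer.
    binary scan [encoding convertto ebcdic $char] c code
    set code [expr {$code & 0xFF}]   ;# "c" scans a signed byte; mask to 0..255
    return [list [expr {$code >> 4}] [expr {$code & 0x0F}]]
}

foreach ch {A I J R S Z 0 9} {
    lassign [ebcdicNybbles $ch] zone digit
    puts [format "%s  zone=0x%X  digit=%d" $ch $zone $digit]
}
```

A to I share zone nybble 0xC (the 12 punch), J to R share 0xD (the 11 punch), S to Z share 0xE (the 0 punch, starting at digit 2), and the digits have 0xF. The low nybble only runs from 1 to 9 within each zone, so the step from I (0xC9) to J (0xD1) skips seven codes; that is the gap in the alphabet.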

RS: Here is a list from Oracle's SQL Language Reference Manual:

 Decimal   Symbol       Decimal   Symbol
 64        blank        108       %
 74        cent sign*   109       _
 75        .            110       >
 76        <            111       ?
 77        (            122       :
 78        +            123       #
 79        |            124       @
 80        &            125       '
 90        !            126       =
 91        $            127       "
 92        *            129..137  a - i
 93        )            145..153  j - r
 94        ;            162..169  s - z
 95        NOT sign*    193..201  A - I
 96        -            209..217  J - R
 97        /            226..233  S - Z
 107       ,            240..249  0 - 9
 *: characters not in ASCII. The NOT sign is a dash with a short hook down.
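
The decimal codes in the list can be turned back into characters with Tcl. This is a minimal sketch, assuming Tcl's bundled ebcdic encoding agrees with the list for letters and digits; ebcdicSymbols is a name invented here for illustration.

```tcl
# Decode a list of decimal EBCDIC codes into a Tcl string.
# (ebcdicSymbols is a hypothetical helper, not part of Tcl.)
proc ebcdicSymbols {codes} {
    # Pack each code into one byte, then interpret the bytes as EBCDIC.
    encoding convertfrom ebcdic [binary format c* $codes]
}

puts [ebcdicSymbols {193 194 195}]   ;# prints ABC
puts [ebcdicSymbols {240 241 242}]   ;# prints 012

# Build the whole 193..201 run from the list above.
set codes {}
for {set c 193} {$c <= 201} {incr c} {
    lappend codes $c
}
puts [ebcdicSymbols $codes]          ;# prints ABCDEFGHI
```

Codes for punctuation differ between EBCDIC code pages, so those rows are best checked against the particular variant in use.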

Does anyone know of Tcl code that converts text strings between ASCII and EBCDIC? RS: sure, encoding:

 % encoding convertto ebcdic EXAMPLE
 ÅçÁÔ×ÓÅ

What about the opposite? I tried:

 set ebcdic [encoding convertto ebcdic "This is a test"]
 set as [encoding convertto ascii $ebcdic]

but that doesn't give back the original text.

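RS: Pilot error. Tcl's internal encoding is always UTF-8, so in a conversion you only ever name the other, non-UTF-8 side: encoding convertto turns a Tcl string into bytes in the named encoding, and encoding convertfrom turns such bytes back into a Tcl string. This works: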

 % encoding convertfrom ebcdic [encoding convertto ebcdic ASCII]
 ASCII

Ah - that explains it. Thanks!
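
The round trip can be wrapped up as a pair of small procs. This is only a sketch; toEbcdic, fromEbcdic and hexDump are names made up for this page, and the ebcdic encoding is the one bundled with Tcl.

```tcl
# Tcl string -> EBCDIC bytes
proc toEbcdic {text} {
    encoding convertto ebcdic $text
}

# EBCDIC bytes -> Tcl string
proc fromEbcdic {bytes} {
    encoding convertfrom ebcdic $bytes
}

# Show a byte string as lowercase hex digits, for inspection.
proc hexDump {bytes} {
    binary scan $bytes H* hex
    return $hex
}

set original "This is a test"
set bytes [toEbcdic $original]
puts [hexDump $bytes]                               ;# the EBCDIC bytes in hex
puts [fromEbcdic $bytes]                            ;# the original text again
puts [string equal $original [fromEbcdic $bytes]]   ;# 1 if the round trip is lossless
```

The failed attempt above called encoding convertto twice, so the EBCDIC bytes were treated as ordinary characters and encoded a second time rather than decoded.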


AMG: Does Tcl compile and run on EBCDIC machines, or does it have a hard ASCII requirement? Since Tcl's built-in string encoding is always UTF-8, I imagine compiling on EBCDIC would be problematic because of C character constants. If I'm not mistaken, the C expression 'A' evaluates to 0xc1 on EBCDIC, yet would be encoded 0x41 in UTF-8, so on EBCDIC, checking if a Tcl string byte equals 'A' would not succeed in checking if it is a capital A character.

I'm asking because I want to know if the fact that I'm coding a C extension for Tcl automatically disqualifies my code from ever running on EBCDIC.
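
The byte mismatch described in the question can be checked from a Tcl prompt. This sketch does not say whether Tcl builds on an EBCDIC host; it only shows the two byte values for the letter A, and byteOf is a helper name invented for illustration.

```tcl
# Return the first byte of a character in the given encoding, as hex text.
# (byteOf is a hypothetical helper, not part of Tcl.)
proc byteOf {encodingName char} {
    binary scan [encoding convertto $encodingName $char] c code
    format 0x%02X [expr {$code & 0xFF}]   ;# mask because "c" scans a signed byte
}

puts "utf-8:  [byteOf utf-8 A]"    ;# prints utf-8:  0x41
puts "ebcdic: [byteOf ebcdic A]"   ;# prints ebcdic: 0xC1
```

A C comparison of a UTF-8 string byte against the constant 'A' therefore only matches if the compiler's execution character set gives 'A' the value 0x41.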