**A real problem**is about issues such as comparing floating point numbers for equality.

## See Also edit

- Doing arithmetic on currency
- The trouble with using floating point for currency is that Many problems that people have with floating point arithmetic in Tcl
*and other languages*are associated with Doing arithmetic on currency. The trouble with using floating point for currency is that floating-point arithmetic deals with*binary*fractions, while bankers are used to dealing with*fixed*-point*decimal*fractions - which are better modeled as integers.

## Description edit

KBK 2001-02-20:Comparing two variables with double values (as FORTRAN was my mother tongue, I still associate REAL with floating points ;-) may show inequality even if the string representations are the same, as Christian Heide Damm pointed out in comp.lang.tcl:*Sometimes tcl says that two numbers are not the same, even when they look the same. I guess it's because tcl has more (internal) information about the numbers than it shows. Is this true? It makes programming a bit tricky, since you never know the exact value of a variable. Any work-arounds?*I suppose always using [string equal] instead of == is more bullet-proof, but it again slightly bloats the code -- RS

*I have attached a small example to illustrate the problem.*

# In modern versions of Tcl, it's necessary to jimmy tcl_precision to # illustrate the phenomenon set tcl_precision 12 set a [expr 2.0/3] set b [expr 2.0/3 * 1.000000000001] puts "a=$a" puts "b=$b" if {$a==$b} { puts "a and b are equal" } else { puts "a and b are NOT equal!" } puts "\nNow, let's \"convert\" a and b to strings"

*# rather, make a list representation, thus throwing away the double rep. The string rep is always there, or is regenerated -- RS*

set dummy [string length $a] set dummy [string length $b] if {$a==$b} { puts "a and b are equal" } else { puts "a and b are NOT equal!" }Another note: You always get inequality (but also unexpected binarization noise) if you set tcl_precision to the maximum value of 17 - RS

a=0.66666666666666663 b=0.66666666666733332At tcl_precision 12, it was

a=0.666666666667 b=0.666666666667

*DKF:*The problem is that floating point numbers are only ever known imprecisely, and testing for equality between floats is not a good idea at all in

*any*language, and not just Tcl. The best you can do is to see if the difference between two numbers is less than a threshold value. For example:

# Epsilon is an optional parameter which should be chosen to fit with # the scale of the numbers being compared. Here I pick a value that # works well with a tcl_precision of 12 and values whose exponent is # near zero... proc withinEpsilon {x y {epsilon 1e-12}} { return [expr {abs($x-$y) < $epsilon}] } # Using the values from above if {[withinEpsilon $a $b]} { puts "a and b are equal" } else { puts "a and b are NOT equal!" }This is easy enough, except for determining what

*epsilon*should be. Well, the answer sort-of lies above -

*epsilon*should be about

*1e[deriveEpsilonMagnitude $x $y]*where

*deriveEpsilonMagnitude*is defined below - except that you don't really want to use that (you might be working with numbers from (1,100000) and it is a good idea to have a constant concept of comparison because your conceptual model of the scale of the number is constant over the range of values it can take,) but rather work with the maximal log of the range of values

*$x*can take (ditto for

*$y*.) Which is kind-of tricky to determine automatically in a script; it is one of those things that requires either programmer thought or a sophisticated automated reasoning engine, with programmer thought being easier to come by!

proc deriveEpsilonMagnitude {x y} { # Work out average magnitude of x and y and subtract the current # size of Tcl's formatting precision... expr {int( (log10(abs($x)) + log10(abs($y))) / 2) - $::tcl_precision} }

Another "real" problem can be observed when formatting doubles (which does an implicit rounding), even far away from the limits of precision:

% format %.1f 1.55 1.6 % format %.1f 2.55 2.5 .. % format %.1f 7.55 7.5 % format %.1f 8.55 8.6This is partly platform-dependent:

format %8.2f 1.625 ##(8.2.3/NT) 1.63 format %8.2f 1.625 ##(8.0.5/Solaris) 1.62 ut this is not a Tcl bug, a C program with (s)printf has the same behavior: main() { printf("%.1f\n", 0.55); printf("%.1f\n", 1.55); printf("%.1f\n", 2.55); printf("%.1f\n", 3.55); printf("%.1f\n", 4.55); printf("%.1f\n", 5.55); printf("%.1f\n", 6.55); printf("%.1f\n", 7.55); printf("%.1f\n", 8.55); printf("%.1f\n", 9.55); } $ cc -o x x.c $ ./x 0.6 1.6 2.5 3.5 4.5 5.5 6.5 7.5 8.6 9.6A brief workaround can be on the Tcl side

proc fformat {value precision} { set v [expr {$value + 0.5 * pow (10, -$precision)}] string range $v 0 [expr [string first "." $v] + $precision] }This mostly treats the "round at 0.5" threshold correctly, except when reaching the limits of precision:

fformat 47.11499999999 2 47.12 <----- my cup runneth over.. fformat 47.1149999999 2 47.11A deeper cut would go, depending on platform, into the C side, as Matthias Hammelsbeck remarks:I propose a workaround for Solaris (which can be used not only for tcl):1. In directory tcl/unix create the file tclUnixRound.c:

#include "tclInt.h" /* Seems to work correctly until precision <= 10 at least : simply add a small amount to the value*/ double TclpGoodround (double value, int precision) { return value + 0.000000000000001; }2. For Windows in directory tcl/win create the file tclWinRound.c. Since under Windows the format proc works as expected, we do nothing here:

#include "tclInt.h" double TclpGoodround (double value, int precision) { return value; }3. Declare the new function in tcl/generic/tclInt.h4. In tcl/generic/tclCmdAH.c, insert before the line with whichValue = DOUBLE_VALUE; (this is line 2212 in Tcl 8.3.1) the following statement:

doubleValue = TclpGoodround (doubleValue, precision);5. Compile, link and enjoy!!As result you have no difference in behaviour between tcl under Solaris and under WIndows NT.

KBK 2000-11-01:This problem has no good solution on a binary machine.If you look at the number 0.55, it's actually a repeating fraction in binary:

____ 0.1000110011001100....(If you'd rather use hex, it's 0.8CCCCC...)The problem is that these numbers get truncated at different points, unavoidably, when represented on a machine to finite precision. The difference in the rounding when the integer part changes relates to the fact that the fraction part may be rounded itself! For instance, 0.55, after it's been converted to a double, is in binary

.1000110011001100110011001100110011001100110011001101because it's been rounded to the nearest double. Reconverting that number to decimal gives 0.55000000000000004. 2.55, on the other hand, rounds downward when rounding to the nearest double:

10.10001100110011001100110011001100110011001100110011and reconverts to 2.54999999999999982. (Both of these strings of bits are assuming IEEE-754 representation.)This problem arises most often when programmers make the mistake of trying to deal with currency (dollars and cents, pounds and p, marks and pfennigs, whatever...) as binary floating point. Bankers find the error of one unit in the last place troubling. The right thing to do with this would be to have a "currency" package that mirrors expr but maintains high-precision (64 bit?) numbers scaled to a specific number of decimal digits (so that one deals with an integer number of pennies). The only languages I know of that do currency correctly "out of the box" are (shudder!) Visual Basic and Cobol.For the curious, here's a Tcl program that formats the number itself, to show what's going on as printf() does its thing.

set tcl_precision 17 # Donal Fellows's K combinator allows surgery on lists. proc K { x y } { set x } # The 'formbinary' procedure formats a floating-point fraction in # binary. proc formbinary { num } { set s 0. for { set i 0 } { $i < 52 } { incr i } { set num [expr { $num + $num }] set digit [expr { int($num) }] set num [expr { $num - $digit }] append s $digit } return $s } # The 'formfract' procedure formats a floating-point number without # resorting to the C implementation of 'printf'. It demonstrates that # the rounding is correct! proc formfract { n { p 6 } } { # extract integer part set s [expr { int($n) }] set r [expr { $n - $s }] # if formatting to 0 places, just return. Otherwise, stick in the # decimal point if { $p == 0 } { return $s } # make the string of decimal digits puts "the fractional part of $n is $r" puts " in binary, it's [formbinary $r]" for { set i 0 } { $i < $p } { incr i } { set r [expr { 10.0 * $r }] set d [expr { int($r) }] lappend digits $d set r [expr { $r - $d }] puts "digit $i after the decimal point is $d, and the remaining" puts " fraction is $r" puts " in binary, it's [formbinary $r]" } # Put out a debug line to show what's going on. puts "before rounding, value is ${s}.[join $digits {}]" puts " and the remaining fraction is $r" puts " in binary, it's [formbinary $r]" # Handle rounding. if { $r >= 0.5 } { # Increment the decimal integer represented by the string of digits # in $digits. Start with the rightmost digit. set carry 0 set i [expr { $p - 1 }] while { $i >= 0 } { # Increment one digit, and record whether there's a carry to # the next digit. set d [lindex $digits $i] incr d if { $d >= 10 } { set d 0 set carry 1 } else { set carry 0 } set digits [lreplace [K $digits [set digits {}]] $i $i $d] # Keep moving left until there's no carry if { $carry } { incr i -1 } else { break } } # Handle a carry into the integer part if { $i < 0 && $carry } { incr s } } # Put the fraction part onto the integer part append s . [join $digits {}] return $s } # Demonstrate that format actually is working. puts "0.55 eventually formats as [formfract 0.55 1]" puts {} puts "1.55 eventually formats as [formfract 1.55 1]" puts {} puts "2.55 eventually formats as [formfract 2.55 1]" puts {} puts "4.55 eventually formats as [formfract 4.55 1]" puts {} puts "8.55 eventually formats as [formfract 8.55 1]"Running the program gives:

the fractional part of 0.55 is 0.55000000000000004 in binary, it's 0.1000110011001100110011001100110011001100110011001101 digit 0 after the decimal point is 5, and the remaining fraction is 0.5 in binary, it's 0.1000000000000000000000000000000000000000000000000000 before rounding, value is 0.5 and the remaining fraction is 0.5 in binary, it's 0.1000000000000000000000000000000000000000000000000000 0.55 eventually formats as 0.6 the fractional part of 1.55 is 0.55000000000000004 in binary, it's 0.1000110011001100110011001100110011001100110011001101 digit 0 after the decimal point is 5, and the remaining fraction is 0.5 in binary, it's 0.1000000000000000000000000000000000000000000000000000 before rounding, value is 1.5 and the remaining fraction is 0.5 in binary, it's 0.1000000000000000000000000000000000000000000000000000 1.55 eventually formats as 1.6 the fractional part of 2.55 is 0.54999999999999982 in binary, it's 0.1000110011001100110011001100110011001100110011001100 digit 0 after the decimal point is 5, and the remaining fraction is 0.49999999999999822 in binary, it's 0.0111111111111111111111111111111111111111111111111000 before rounding, value is 2.5 and the remaining fraction is 0.49999999999999822 in binary, it's 0.0111111111111111111111111111111111111111111111111000 2.55 eventually formats as 2.5 the fractional part of 4.55 is 0.54999999999999982 in binary, it's 0.1000110011001100110011001100110011001100110011001100 digit 0 after the decimal point is 5, and the remaining fraction is 0.49999999999999822 in binary, it's 0.0111111111111111111111111111111111111111111111111000 before rounding, value is 4.5 and the remaining fraction is 0.49999999999999822 in binary, it's 0.0111111111111111111111111111111111111111111111111000 4.55 eventually formats as 4.5 the fractional part of 8.55 is 0.55000000000000071 in binary, it's 0.1000110011001100110011001100110011001100110011010000 digit 0 after the decimal point is 5, and the remaining fraction is 0.50000000000000711 in binary, it's 0.1000000000000000000000000000000000000000000000100000 before rounding, value is 8.5 and the remaining fraction is 0.50000000000000711 in binary, it's 0.10000000000000000000000000000000000000000000001000008.55 eventually formats as 8.6

*DKF*- On a more general note, you can only accurately represent a fraction (rational number) in "decimal" form in a finite number of digits if the rational's denominator's prime factors are also prime factors of the number base which you are trying to represent it in (for decimal numbers, the prime factors are 2 and 5,) assuming that the GCD of the numerator and denominator is 1. All other rationals can only be represented as a recurring fraction with some prefix of finite length and then another finite digit string repeated an infinite number of times. I don't quite remember the proof right now...Lars H 2003-0-24: As for the proof: think of doing long division on the fraction. The quotient (as a "decimal" fraction) terminates if and only if at some point you get remainder 0 at a point after all digits in the numerator have been used up. When all digits in the numerator have been used up, computing the next remainder amounts to multiplying by the base (10 if you're doing decimal, 2 if you're doing binary) and %-ing with the denominator. This can be seen simply as multiplying with the base in modulo-the-denominator arithmetic. Hence if at some point you have the remainder 1 you can only get a remainder 0 if the base is nilpotent modulo-the-denominator, or returning to integers, if there is some power of the base which is a multiple of denominator. As claimed, this only happens if the denominator is a product of primes each of which divides the base.

KBK 2000-11-02: Yeah, I remember the theorem (but not the proof) too. All of which doesn't change the fact that accountants have very specific rules for the rounding of arithmetic involving currency. These rules are well-nigh-impossible to implement in binary floating point arithmetic, but instead require large-precision integers scaled by powers of ten. At least we seldom have to deal with mixed radix currencies any more; dealing with pounds-shillings-pence was quite an £sd trip.

£2 16s 2 3/4dwas read as "two pounds, sixteen shillings tuppence ha'penny farthing").

RS: Cool, an unreal problem! Except for fractions of old pence, the following converts decimal GBP amounts to L-s-d, and Unicoded:

proc dec2trad amount { set L [expr int($amount)] set s [expr int(($amount-$L)*20)] set d [expr int(($amount-$L-$s/20.)*240+0.5)] set res "" if $L {append res "\u20A4$L"} if $s {append res " ${s}s"} if $d {append res " ${d}d"} set res } dec2trad 1.98 £1 19s 7d dec2trad 2.34 £2 6s 10dOr, as I'm just being told by the Web, there was also an optional £x/y/z style, where a dash expressed zero in the shilling and penny positions. So here's version 0.2, where a second argument different from space produces the slash notation:

proc dec2trad {amount {slash " "}} { set L [expr int($amount)] set s [expr int(($amount-$L)*20)] set d [expr int(($amount-$L-$s/20.)*240+0.5)] set res "" if {$slash==" "} { if $L {append res "\u20A4$L"} if $s {append res " ${s}s"} if $d {append res " ${d}d"} } else { if $L {append res "\u20A4$L/"} if $s {append res $s} else {append res -} if $d {append res /$d} else {append res /-} } set res } dec2trad 10.01 / £10/-/2 dec2trad .10 / 2/-

KBK 2000-11-02: Once again, this is just asking for roundoff problems in binary floating point arithmetic. To get correct behavior near the corners, the internal representation of the amount must be the count of farthings. For interest calculations, guard digits are required as well, and the way it's done is well known to the accountants. IEEE-754 does

*not*do it right. It's not intended for the purpose.Thanks, Richard, by the way, for giving me the £ sign; I couldn't figure how to get it into Netscape from a US keyboard!

[John Hermann] wrote in comp.lang.tcl:

set a 9233.8 set f 100 set z [expr int($a * $f)] ;# => 923379but in two steps you'll get it right:

set a 9233.8 set f 100 set z1 [expr $a * $f] set z [expr int($z1)] ;# => 923380This can also be shown in C:

#include <math.h> main () { int i=100; double j=9233.8; printf("%d\n", (int) (i*j)); }Now, if you change the "double" variable to a "float" variable, you get the right answer.

*[Sergei Kucherov] commented:*I suppose your example shows that it is better to use "round(...)" rather than "int(...)" when you are expecting an exact integer answer to a double precision floating point computation. Or else to compute with less precise float values, which seems counter-intuitive. This is a good lesson to anyone who computes with floating values (hmmm, web commerce).I am surprised that enough guard digits aren't maintained when doing the computation to give the right answer. It appears from your experiments that the same algorithm is used to do a double multiply in the different environments you tested. I wonder how "special" the example you found is?

*RS*: Here we also see a difference between braced and unbraced expr:

set a 9233.8 set f 100 set x [expr $a*$f] expr int($x) ;# -> 923380 expr {int($x)} ;#-> 923379The first expr produces an object with double and string representation. If you call int($x) unbraced, the Tcl parser substitutes $x with "923380.0". If you call int($x) with braces, the string "int($x)" is parsed by expr, for $x the double value is substituted, but that is slightly less than 923380.0.

Fabrice Pardo <[email protected]> commented on this aspect:The Tcl result,

% puts [expr {int(100*9233.8)}] 923379, is perfectly compliant with IEEE standard representation of floating numbers. You should use round to convert from double to int

% puts [expr {round(100*9233.8)}] 923380This is because internal floating double (IEEE standard) representation of 9233.8 is impossible. The closest approximation is 9233.79999.. This value multiplied by 100. gives 923379.999999 The "int" truncation gives 923379.The "float" closest approximation is 9233.800..xFor other numbers, you can get the opposite result, float < double.These procedures are useful to show the IEEE standard representation of floating double numbers on BigEndian machines. For LittleEndian procedures, see IEEE floating numbers

proc floatToBinar {d} { binary scan [binary format d $d] B* v set sign [string index $v 0] set exponent [string range $v 1 11] set mantissa [string range $v 12 end] return [list $sign $mantissa $exponent] } proc binarToFloat {sign mantissa exponent} { if {$sign != "0" && $sign != "1"} { error "bad sign \"$sign\"" } if {[string length $mantissa] != 52} { error "bad mantissa \"$mantissa\"" } if {[string length $exponent] != 11} { error "bad exponent \"$exponent\"" } set v [binary format B64 $sign$exponent$mantissa] binary scan $v d v return $v }

% floatToBinar 1 0 0000000000000000000000000000000000000000000000000000 01111111111

- The first 0 is the sign +,
- The 52 bits number is the mantissa, in the form 1.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx first bit has the decimal 0.5 value second bit has the decimal 0.25 value, etc.
- The 11 bits number is the 2-exponent multiplicator :

01111111111 has the value 2^0 = 1 10000000000 has the value 2^1 = 2 10000000001 has the value 2^2 = 4 10000000010 has the value 2^3 = 4 ... 11111111110 has the value 2^1023, approximately 8.98846567431e+307 01111111101 has the value 2^(-2) = 1/4 01111111110 has the value 2^(-1) = 1/2 ... 00000000001 has the value 2^(-1022), approximately 2.22507385851e-308 % floatToBinar 9233.8 0 0010000010001110011001100110011001100110011001100110 10000001100 0 -> sign + 0010000010001110011001100110011001100110011001100110 -> 1 + 0/2 + 0/4 + 1/8 + 0/16 ... = 1 + 572735606908518/4503599627370496 = 1.12717285156... 10000001100 -> 2^13 = 8192

*Robert Heller <[email protected]> explains how guard bits help precision:*Guard digits (really guard bits, since modern computers use binary FP), are extra digits that are used to extend the precision of the fp operations to avoid the kinds of roundoff error typical of fixed-precision floating point arithmetic. These guard bits live in the FP hardware, and so how many depend on the processor itself. Note: since Tcl is not compiled to native assembler with the usual processor-dependent optimizations, it is likely the the guard bits are generally lost when you do a sequence of Tcl FP operations. A C or FORTRAN compiler when faced with something like:

#include <math.h> double a,b,c; a = (sqrt((b * b)+(c * c))) * tan(M_PI_2); /* Silly useless math */will leave the various temp (intermediate) results in registers inside the FPU, where they will be like 80 bits (vs. 64 bits when in CPU registers or main memory). Thus there will be some 16 guard bits to 'absorb' the random low-order junk bits. Tcl does not do this (AFAIK). The temp (intermediate) results are stored as strings (old Tcl) or as 64 bit doubles (current Tcl). The guard bits are lost and round-off creeps in.Tcl is NOT good for serious heavy math. If you have some math (floating point) function of some complexity and where precision is needed, you would be well advised to code this function in C and make it a loadable extension.AM 2004-07-16: Actually, the extra precision is a nuisance more than it helps getting more accurate answers. The reason is that there are (with current-day processors such as the Intel ones that use this feature) too few registers that can hold the guard bits to make it work properly.The result is that the outcome of a computation depends on details of the compilation process you have no control over. It is even possible to write perfectly correct programs (correct from the point of view of the programming language) and still end up with the wrong answer when running it. Here is a trivial example (well, the only one I know of that demonstrates this so clearly):

/* Naive program to determine epsilon */ #include <stdlib.h> #include <stdio.h> void check(float a, float b) { printf( "a = %25.17g, b = %25.17g\n", a, b ) ; if ( a != b ) { printf( "eps is correct\n" ) ; } else { printf( "wrong answer!\n" ) ; } return ; } int main( int argc, char *argv[] ) { float eps ; float x ; float y ; float z ; x = 1.0f ; y = 1.0f ; z = 1.0f ; eps = 1.0f ; /* Avoid aggressive optimisation ... */ while ( x+eps/2.0f != y ) { eps = eps / 2.0f ; } z = z + eps ; printf( "eps = %g\n", eps ) ; check(z,y) ; return 0; }Note: To assure that no intermediate results are cached in the processor, the program is more elaborate than you might deem necessary. That is one of the reasons the whole feature is very annoying at least and IMO downright dangerous!AM 2004-08-13: Andy Goth ran this program on a DEC Alpha computer, using gcc (with default options) and got correct results. (DEC Alphas have excellent floating-point characteristics, but I have seen correct behaviour on Sun SPARC computers too).

LV: Anyone familar with this package - is this something that would be useful to add into a Tcl extension?

name: fcmp 1.2.1 posted on: Nov 02nd 2000, 23:34 EST license: LGPL category: Development/Libraries homepage: http://freshmeat.net/projects/fcmp/homepage/ download: http://freshmeat.net/projects/fcmp/download/ description: It is generally not wise to compare two floating-point values for exact equality, for example using the C == operator. The fcmp package implements Knuth's suggestions for safer floating-point comparison operators as a C function. changes: A minor configuration cleanup was made. urgency: low |> http://freshmeat.net/projects/fcmp/

See also http://python.sourceforge.net/devel-docs/tut/node14.html , for pertinent background information on computing arithmetic.

Arjen Markus: I have been working on an extension of the functions supported by expr that takes care of many of these issues. A first version is available, as well as a Tcl-only implementation of (for now) a small set of these. To be published soon - I hope.

Kevin Kenny 2003-03-24, writes on comp.lang.tcl that he considers the default tcl_precision to be a bug, and even plans to fix it in Tcl 9:

*Everything is still a string. All the nonsense about internal representations is there to deal with performance issues. If you can tell in another way that something*'isn't

*' a string, that's a bug.*KBK: Gee, does everything I say on c.l.t end up on the Wiki? :)Anyway, let me fill in a little more of the detail. The main reason that tcl_precision is the way it is has to do with the fact that

*Tcl_PrintDouble*depends on

*sprintf*to do its work, and the quality of implementation of most

*sprintf*(

*strtod*, actually) functions is rather poor. There's absolutely no excuse for:

% set tcl_precision 17 17 % expr 0.1 0.10000000000000001 % expr 0.2 0.20000000000000001 % expr 0.3 0.29999999999999999At least, there's no excuse for it in the last decade, since David Gay's implementations of correctly rounded conversions were released. (Gay's technical report is available at [1], and his code at [2].)If Tcl implements correctly rounded conversions, then the need for tcl_precision largely goes away, except for those that abuse floating point for doing arithmetic on currency. In any case, setting tcl_precision to anything other than 17 causes incorrect behavior in many numerical calculations. It creates string representations that don't match the numeric ones. It causes answers that are simply bewildering:

% set tcl_precision 12 12 % set x [expr 1.234567890123] 1.23456789012 % set y [expr 1.234567890124] 1.23456789012 % expr { $x == $y } 0 % expr { $x eq $y } 1TIP #132 has the specification of the proposed change.

AM 2004-07-15: I wrote a Computers and real numbers to present some of the material here in an easy-to-read kind of way.

TV: Not sure how that is understood, I made a page about synthesizing curves, which IMNSHO is relevant as to indicate practice in certain ways, where it concerns actual data and problems where numerical approximations are at stake.

RS 2004-12-20: To put it short, in C and Tcl there are no "real" numbers. All we can precisely express is rationals that satisfy

a/pow(2,b)with a and b being integers (and not any integers, just a subset - b going up to 208 or so...)

beernutmark 2011-05-02 - Or use Decimal Arithmetic Package for tcl 8.5 which adds precise decimal arithmetic functionality to tcl.