Version 0 of Sequential correlations and the output of rand

Updated 2001-04-23 21:15:19

Purpose: Show how to improve the output of the math function rand to reduce sequential correlation.


The sequence of numbers returned by rand is prone to sequential correlations if large runs of numbers are taken. For example, if clusters of six sequential results from

    expr { rand () }

are analyzed, it will turn out that they lie on only 40 hyperplanes in six-dimensional space. This limitation means that rand() is not suitable by itself for applications like high-resolution Monte Carlo integration.

The following code implements a "better" rand that reduces this problem by shuffling the sequence that is returned. For further discussion, see Donald Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, which has a wealth of information about random number generators and their foibles.


 namespace eval BayesDurham {

     namespace export rand

     variable poolSize 97;               # Size of the pool of random numbers

     variable v;                         # Pool of random numbers

     variable y;                         # Most recently generated random number

 }

 #----------------------------------------------------------------------
 #
 # BayesDurham::rand --
 #
 #       Generate a random floating point number between 0 and 1
 #
 # Parameters:
 #       None.
 #
 # Results:
 #       Returns a randomly chosen floating point number in the 
 #       range [0 .. 1)
 #
 # Side effects:
 #       Stores state in several namespace variables.
 #
 # The BayesDurham::rand procedure attempts to improve upon the 
 # system-supplied rand() function by breaking up sequential
 # correlations.  It works by shuffling a vector of numbers supplied by
 # the system's random number generator, using its own output to choose
 # the next slot in the vector.
 #
 #----------------------------------------------------------------------

 proc BayesDurham::rand {} {

     variable poolSize
     variable v
     variable y

     # Test whether we're initializing the sequence

     if { ![info exists v] } {

         # Discard some random numbers - in case the RNG was just
         # seeded.

         for { set i 0 } { $i < $poolSize } { incr i } {
             expr { rand() }
         }

         # Save a pool of random numbers

         for { set i 0 } { $i < $poolSize } { incr i } {
             set v($i) [expr { rand() }]
         }

         # Generate one more random number

         set y [expr { rand() }]

     }

     # Use the most recently generated number to select from the pool.

     set i [expr { int( double( $poolSize ) * $y ) }]
     set y $v($i)
     set v($i) [expr { rand() }]

     return $y

 }