'''Purpose:''' Show how to improve the output of the math function [rand] to reduce sequential correlation. ---- The sequence of numbers returned by [rand] is prone to sequential correlations if large runs of numbers are taken. For example, if clusters of six sequential results from expr { rand () } are analyzed, it will turn out that they lie on only 40 hyperplanes in six-dimensional space. This limitation means that rand() is not suitable by itself for applications like high-resolution Monte Carlo integration. The following code implements a "better" rand that reduces this problem by shuffling the sequence that is returned. For further discussion, see [Donald E. Knuth], ''The Art of Computer Programming, Volume 2: Seminumerical Algorithms,'' which has a wealth of information about random number generators and their foibles. ---- **Implementation using array** namespace eval BayesDurham { namespace export rand variable poolSize 97; # Size of the pool of random numbers variable v; # Pool of random numbers variable y; # Most recently generated random number } #---------------------------------------------------------------------- # # BayesDurham::rand -- # # Generate a random floating point number between 0 and 1 # # Parameters: # None. # # Results: # Returns a randomly chosen floating point number in the # range [0 .. 1) # # Side effects: # Stores state in several namespace variables. # # The BayesDurham::rand procedure attempts to improve upon the # system-supplied rand() function by breaking up sequential # correlations. It works by shuffling a vector of numbers supplied by # the system's random number generator, using its own output to choose # the next slot in the vector. # #---------------------------------------------------------------------- proc BayesDurham::rand {} { variable poolSize variable v variable y # Test whether we're initializing the sequence if { ![info exists v] } { # Discard some random numbers - in case the RNG was just # seeded. for { set i 0 } { $i < $poolSize } { incr i } { expr { rand() } } # Save a pool of random numbers for { set i 0 } { $i < $poolSize } { incr i } { set v($i) [expr { rand() }] } # Generate one more random number set y [expr { rand() }] } # Use the most recently generated number to select from the pool. set i [expr { int( double( $poolSize ) * $y ) }] set y $v($i) set v($i) [expr { rand() }] return $y } ---- **Implementation using list** [EKB] Here's an alternative implementation: namespace eval shufflerand { variable vals variable vallen 98 } proc shufflerand::init {args} { variable vals variable vallen set haveseed false foreach {arg argval} $args { switch -- $arg { -len { if [string is integer $argval] { if {$argval > 0} { set vallen $argval } else { error "[namespace current]::init: Length must be greater than zero" } } else { error "[namespace current]::init: Length must be an integer" } } -seed { if [string is integer $argval] { set seed $argval set haveseed true } else { error "[namespace current]::init: Seed must be an integer" } } default { error "[namespace current]::init: Possible arguments are: -seed -len" } } } set vals {} if $haveseed { lappend vals [expr {srand($seed)}] } else { lappend vals [expr {rand()}] } for {set i 1} {$i < $vallen} {incr i} { lappend vals [expr {rand()}] } } proc shufflerand::rand {} { variable vals variable vallen set retval [lindex $vals end] set ndx [expr {int($vallen * $retval)}] lset vals end [lindex $vals $ndx] lset vals $ndx [expr {rand()}] return $retval } # Test console show shufflerand::init puts "A list of 10 random numbers from shuffled list:" for {set i 0} {$i < 10} {incr i} { puts [shufflerand::rand] } puts "\nTime to generate 1000 values using rand():" puts [time {for {set i 0} {$i < 1000} {incr i} {expr {rand()}}}] puts "\nTime to generate 1000 values using shufflerand:" puts [time {for {set i 0} {$i < 1000} {incr i} {shufflerand::rand}}] Here's some sample output from the test code at the end: A list of 10 random numbers from shuffled list: 0.00284572923688 0.897247713477 0.649551953492 0.456922465682 0.0448791366279 0.0441727326457 0.232282039818 0.645288598093 0.753969432672 0.207090688034 Time to generate 1000 values using rand(): 558 microseconds per iteration Time to generate 1000 values using shufflerand: 4052 microseconds per iteration So, it's about 7-8 times slower than the straight linear congruential generator that's bundled with Tcl. ---- [EKB] Hm...Now using Tcl 8.5, rand() takes a little longer, shufflerand a little less long. Here are some runs for rand(), shufflerand, and Bayes-Durham: Time to generate 1000 values using rand(): 738 microseconds per iteration Time to generate 1000 values using shufflerand: 3202 microseconds per iteration Time to generate 1000 values using Bayes-Durham rand: 3941 microseconds per iteration So slightly longer with Bayes-Durham (about 20%). Possibly because of the test for whether the pool had been created or not. I decided not to include that in the "shufflerand" proc because typically the call to rand() is in the heart of a loop. I also cached the pool size. ---- **Implementation for 8.5 using list** [EKB] This is a version for Tcl 8.5 that replaces the existing rand() with a shuffled version, so code that uses rand() doesn't have to be rewritten: namespace eval shufflerand { variable vals } proc shufflerand::init {} { variable vals for {set i 0} {$i < 98} {incr i} { lappend vals [expr {rand()}] } # lcrand -- "Linear Congruential rand" rename ::tcl::mathfunc::rand ::tcl::mathfunc::lcrand proc ::tcl::mathfunc::rand {} "[namespace current]::rand" } proc shufflerand::rand {} { variable vals set retval [lindex $vals end] set ndx [expr {int(98 * $retval)}] lset vals end [lindex $vals $ndx] lset vals $ndx [expr {lcrand()}] return $retval } # Test console show puts "\nTime to generate 1000 values using default rand():" puts [time {for {set i 0} {$i < 1000} {incr i} {expr {rand()}}}] shufflerand::init puts "\nTime to generate 1000 values using shuffled rand():" puts [time {for {set i 0} {$i < 1000} {incr i} {expr {rand()}}}] ---- **The effect on shuffling on the period** [Lars H]: I claim in [Random Number] that shuffling such as the above doesn't do much for the period of the PRNG. The following is the reasoning that led me to that conclusion. ***Modified implementation for analysis*** Not all of the data present in the numbers being shuffled are part of the practical state of the shuffler. Concretely, it is possible to view the numbers merely as indices into the array of numbers and still operate the shuffler exactly as above. To illustrate this follows here a modification of the first implementation which keeps the values and indices as separate arrays. namespace eval BayesDurham { namespace export rand variable poolSize 97 ;# Size of the pool of random numbers variable value ;# Pool of random numbers variable index ;# Corresponding index values variable y ;# Index of last output } proc BayesDurham::num2index {num} { variable poolSize return [expr { int( double($poolSize) * $num ) }] } proc BayesDurham::initFromRand {} { variable poolSize variable value variable index variable y # Discard some random numbers - in case the RNG was just seeded. for {set i 0} {$i < $poolSize} {incr i} {expr rand()} # Save a pool of random numbers for {set i 0} {$i < $poolSize} {incr i} { set value($i) [expr rand()] set index($i) [num2index $value($i)] } # Generate "last" index set y [num2index [expr rand()]] } proc BayesDurham::initFromRandIfNecessary {} { initFromRand # Turn self into no-op. proc initFromRandIfNecessary {} {} } proc BayesDurham::rand {} { variable value variable index variable y initFromRandIfNecessary set i $y set y $index($i) set r $value($i) set value($i) [expr rand()] set index($i) [num2index $value($i)] return $r } Here, the ''value'' array has no influence on how the numbers are shuffled; everything is controlled by the ''index'' array and the ''y'' variable. These entries take values in the range 0 through $poolSize-1, so the shuffling extends the PRNG state space relevant for the period by a factor pow($poolSize, $poolSize+1). It doesn't extend the period by that much, however. ***Probabilistic analysis*** [[To be continued.]] ---- !!!!!! %| [Category Mathematics] | [Category Statistics] |% !!!!!!