Generation methods
Generation methods[edit]
Physical methods[edit]
The earliest methods for generating random numbers, such as dice, coin flipping and roulette wheels, are still used today, mainly in games and gambling as they tend to be too slow for most applications in statistics and cryptography.
A physical random number generator can be based on an essentially random atomic or subatomic physical phenomenon whose unpredictability can be traced to the laws of quantum mechanics. Sources of entropy include radioactive decay, thermal noise, shot noise, avalanche noise in Zener diodes, clock drift, the timing of actual movements of a hard disk read-write head, and radio noise. However, physical phenomena and tools used to measure them generally feature asymmetries and systematic biases that make their outcomes not uniformly random. A randomness extractor, such as a cryptographic hash function, can be used to approach a uniform distribution of bits from a non-uniformly random source, though at a lower bit rate.
The appearance of wideband photonic entropy sources, such as optical chaos and amplified spontaneous emission noise, greatly aid the development of the physical random number generator. Among them, optical chaos[3][4] has a high potential to physically produce high-speed random numbers due to its high bandwidth and large amplitude. A prototype of a high speed, real-time physical random bit generator based on a chaotic laser was built in 2013.[5]
Various imaginative ways of collecting this entropic information have been devised. One technique is to run a hash function against a frame of a video stream from an unpredictable source. Lavarand used this technique with images of a number of lava lamps. HotBits measures radioactive decay with Geiger–Muller tubes,[6] while Random.org uses variations in the amplitude of atmospheric noise recorded with a normal radio.
Another common entropy source is the behavior of human users of the system. While people are not considered good randomness generators upon request, they generate random behavior quite well in the context of playing mixed strategy games.[7] Some security-related computer software requires the user to make a lengthy series of mouse movements or keyboard inputs to create sufficient entropy needed to generate random keys or to initialize pseudorandom number generators.[8]
Computational methods[edit]
Most computer generated random numbers use PRNGs which are algorithms that can automatically create long runs of numbers with good random properties but eventually the sequence repeats (or the memory usage grows without bound). These random numbers are fine in many situations but are not as random as numbers generated from electromagnetic atmospheric noise used as a source of entropy.[9] The series of values generated by such algorithms is generally determined by a fixed number called a seed. One of the most common PRNG is the linear congruential generator, which uses the recurrence
to generate numbers, where a, b and m are large integers, and is the next in X as a series of pseudorandom numbers. The maximum number of numbers the formula can produce is the modulus, m. The recurrence relation can be extended to matrices to have much longer periods and better statistical properties .[10] To avoid certain non-random properties of a single linear congruential generator, several such random number generators with slightly different values of the multiplier coefficient, a, can be used in parallel, with a "master" random number generator that selects from among the several different generators.[citation needed]
A simple pen-and-paper method for generating random numbers is the so-called middle-square method suggested by John von Neumann. While simple to implement, its output is of poor quality. It has a very short period and severe weaknesses, such as the output sequence almost always converging to zero. A recent innovation is to combine the middle square with a Weyl sequence. This method produces high quality output through a long period.[11]
Most computer programming languages include functions or library routines that provide random number generators. They are often designed to provide a random byte or word, or a floating point number uniformly distributed between 0 and 1.
The quality i.e. randomness of such library functions varies widely from completely predictable output, to cryptographically secure. The default random number generator in many languages, including Python, Ruby, R, IDL and PHP is based on the Mersenne Twister algorithm and is not sufficient for cryptography purposes, as is explicitly stated in the language documentation. Such library functions often have poor statistical properties and some will repeat patterns after only tens of thousands of trials. They are often initialized using a computer's real-time clock as the seed, since such a clock is 64 bit and measures in nanoseconds, far beyond the person's precision. These functions may provide enough randomness for certain tasks (for example video games) but are unsuitable where high-quality randomness is required, such as in cryptography applications, statistics or numerical analysis.[citation needed]
Much higher quality random number sources are available on most operating systems; for example /dev/random on various BSD flavors, Linux, Mac OS X, IRIX, and Solaris, or CryptGenRandom for Microsoft Windows. Most programming languages, including those mentioned above, provide a means to access these higher quality sources.
By humans[edit]
Random number generation may also be performed by humans, in the form of collecting various inputs from end users and using them as a randomization source. However, most studies find that human subjects have some degree of non-randomness when attempting to produce a random sequence of e.g. digits or letters. They may alternate too much between choices when compared to a good random generator;[12] thus, this approach is not widely used.
Post-processing and statistical checks[edit]
Even given a source of plausible random numbers (perhaps from a quantum mechanically based hardware generator), obtaining numbers which are completely unbiased takes care. In addition, behavior of these generators often changes with temperature, power supply voltage, the age of the device, or other outside interference. And a software bug in a pseudorandom number routine, or a hardware bug in the hardware it runs on, may be similarly difficult to detect.
Generated random numbers are sometimes subjected to statistical tests before use to ensure that the underlying source is still working, and then post-processed to improve their statistical properties. An example would be the TRNG9803[13] hardware random number generator, which uses an entropy measurement as a hardware test, and then post-processes the random sequence with a shift register stream cipher. It is generally hard to use statistical tests to validate the generated random numbers. Wang and Nicol[14] proposed a distance-based statistical testing technique that is used to identify the weaknesses of several random generators. Li and Wang[15] proposed a method of testing random numbers based on laser chaotic entropy sources using Brownian motion properties.
Other considerations[edit]
Reshaping the distribution[edit]
Uniform distributions[edit]
Most random number generators natively work with integers or individual bits, so an extra step is required to arrive at the "canonical" uniform distribution between 0 and 1. The implementation is not as trivial as dividing the integer by its maximum possible value. Specifically:[16][17]
- The integer used in the transformation must provide enough bits for the intended precision.
- The nature of floating-point math itself means there exists more precision the closer the number is to zero. This extra precision is usually not used due to the sheer number of bits required.
- Rounding error in division may bias the result. At worst, a supposedly excluded bound may be drawn contrary to expectations based on real-number math.
The mainstream algorithm, used by OpenJDK, Rust, and NumPy, is described in a proposal for C++'s STL. It does not use the extra precision and suffers from bias only in the last bit due to round-to-even.[18] Other numeric concerns are warranted when shifting this "canonical" uniform distribution to a different range.[19] A proposed method for the Swift programming language claims to use the full precision everywhere.[20]
Uniformly distributed integers are commonly used in algorithms such as the Fisher–Yates shuffle. Again, a naive implementation may induce a modulo bias into the result, so more involved algorithms must be used. A method that nearly never performs division was described in 2018 by Daniel Lemire,[21] with the current state-of-the-art being the arithmetic encoding-inspired 2021 "optimal algorithm" by Stephen Canon of Apple Inc.[22]
Most 0 to 1 RNGs include 0 but exclude 1, while others include or exclude both.
Other distributions[edit]
Given a source of uniform random numbers, there are a couple of methods to create a new random source that corresponds to a probability density function. One method, called the inversion method, involves integrating up to an area greater than or equal to the random number (which should be generated between 0 and 1 for proper distributions). A second method, called the acceptance-rejection method, involves choosing an x and y value and testing whether the function of x is greater than the y value. If it is, the x value is accepted. Otherwise, the x value is rejected and the algorithm tries again.[23][24]
As an example for rejection sampling, to generate a pair of statistically independent standard normally distributed random numbers (x, y), one may first generate the polar coordinates (r, θ), where r2~χ22 and θ~UNIFORM(0,2π) (see Box–Muller transform).
Whitening[edit]
The outputs of multiple independent RNGs can be combined (for example, using a bit-wise XOR operation) to provide a combined RNG at least as good as the best RNG used. This is referred to as software whitening.
Computational and hardware random number generators are sometimes combined to reflect the benefits of both kinds. Computational random number generators can typically generate pseudorandom numbers much faster than physical generators, while physical generators can generate "true randomness."
Low-discrepancy sequences as an alternative[edit]
Some computations making use of a random number generator can be summarized as the computation of a total or average value, such as the computation of integrals by the Monte Carlo method. For such problems, it may be possible to find a more accurate solution by the use of so-called low-discrepancy sequences, also called quasirandom numbers. Such sequences have a definite pattern that fills in gaps evenly, qualitatively speaking; a truly random sequence may, and usually does, leave larger gaps.
Activities and demonstrations[edit]
The following sites make available random number samples:
- The SOCR resource pages contain a number of hands-on interactive activities and demonstrations of random number generation using Java applets.
- The Quantum Optics Group at the ANU generates random numbers sourced from quantum vacuum. Sample of random numbers are available at their quantum random number generator research page.
- Random.org makes available random numbers that are sourced from the randomness of atmospheric noise.
- The Quantum Random Bit Generator Service at the Ruđer Bošković Institute harvests randomness from the quantum process of photonic emission in semiconductors. They supply a variety of ways of fetching the data, including libraries for several programming languages.
- The Group at the Taiyuan University of Technology generates random numbers sourced from a chaotic laser. Samples of random number are available at their Physical Random Number Generator Service.
Comments
Post a Comment