Issues with the lack of double precision and a proposed solution.

bug-gsl

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issues with the lack of double precision and a proposed solution.

From:	camel-cdr
Subject:	Issues with the lack of double precision and a proposed solution.
Date:	Tue, 09 Feb 2021 21:47:19 +0000

Currently, `gsl_rng_uniform is generator dependent and is in most cases 
implemented by dividing the output of `gsl_rng_get()` by `gsl_rng_max() + 1`, 
as stated in the docs. This has an obvious problem, the resolution of the 
generated floating points is solely dependent on the generators range.

None of the currently implemented generators have a range larger than 2^32 and 
most are smaller, which means that we can never even generate a random double 
to double precision. This issue is a lot bigger as many random distributions 
use `gsl_rng_uniform` internally and/or only return up to single-precision 
floating-point precision, as is the case with `gsl_ran_gaussian_ziggurat`.


But if we want to generate a random double-precision floating-point number in 
the range [0,1), how many bits would be required?

First, let's consider that we can't generate random floats uniformly using 
division by an integer:
   Integer: |----|----|----|----|----|----|----|----|----|----|----|----|

            |    |    /    |   /    /    /    /      \ __/    \ __/     |

   Float:   ||||-|-|-|--|--|--|----|----|----|--------|--------|--------|
There is no 1 to 1 mapping between an integer and a floating-point number. 
Multiple integer values are mapped to the same floating-point value and/or 
smaller floating-point values are skipped.
(see the bottom of http://prng.di.unimi.it/ and the github link below)

One method for generating random floats is to divide by the biggest number that 
doesn't map multiple integer values to the same floating-point number. That 
would be 2^53 for doubles. This is a good compromise between accuracy and 
performance and should be the default for random double generation.

Hence, I propose adding a floating-point distribution function gsl_rand_uniform 
(and maybe gsl_rand_uniformf) that guarantee this precision for all generators.
To make implementing this easier an additional generic interface could be added 
to each RNG: `void gsl_rng_write(gsl_rng *r, void *ptr, size_t n)`, that writes 
to `n` bytes of random data to `ptr`.

There is another method for generating random floating-point values, that 
includes all representable numbers inside [a,b]. I won't explain it here, but 
I've written an implementation of this in 
https://github.com/camel-cdr/cauldron/blob/main/cauldron/random.h#L1353 and 
this could be added as something like `double gsl_rand_uniform_dense(gsl_rng 
*r, double min, double max)`.

I'd like some feedback on these proposed changes, so I can write a patch in the 
future.

Best regards,
Olaf

[Prev in Thread]

Current Thread

[Next in Thread]

Issues with the lack of double precision and a proposed solution., camel-cdr <=

Prev by Date: [bug #60026] Incorporate MIXMAX random number extension into GSL
Next by Date: [bug #50382] Add CMAKE and NUGET support for Windows
Previous by thread: [bug #60026] Incorporate MIXMAX random number extension into GSL
Next by thread: [bug #50382] Add CMAKE and NUGET support for Windows
Index(es):
- Date
- Thread