From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: rand48 replacement |
Date: | 2021-07-04 11:30:41 |
Message-ID: | CAEZATCXcmccPQismOioD5kr2NMyMoeLpDaesizzoKh8M-BK5Dw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, 4 Jul 2021 at 10:35, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>
> I did not understand why it is not correct.
>
Well, to make it easier to visualise, let's imagine our word size is
just 3 bits instead of 64 bits, and that the basic prng() function
generates numbers in the range [0,8). Similarly, imagine a splitmix3()
that operates on 3-bit values. So it might do something like this
(state offset and return values made up):
splitmix3(state):
state=0 -> 5, return 2
state=1 -> 6, return 5
state=2 -> 7, return 0
state=3 -> 0, return 3
state=4 -> 1, return 6
state=5 -> 2, return 1
state=6 -> 3, return 7
state=7 -> 4, return 4
Now suppose we want a random number in the range [0,6). This is what
happens with your algorithm for each of the possible prng() return
values:
prng() returns 0 -- OK
prng() returns 1 -- OK
prng() returns 2 -- OK
prng() returns 3 -- OK
prng() returns 4 -- OK
prng() returns 5 -- OK
prng() returns 6 -- out of range so use splitmix3() with initial state=6:
state=6 -> 3, return 7 -- still out of range, so repeat
state=3 -> 0, return 3 -- now OK
prng() returns 7 -- out of range so use splitmix3() with initial state=7:
state=7 -> 4, return 4 -- now OK
So, assuming that prng() chooses each of the 8 possible values with
equal probability (1/8), the overall result is that the values 0,1,2
and 5 are returned with a probability of 1/8, whereas 3 and 4 are
returned with a probability of 2/8.
Using the correct implementation of the bitmask algorithm, each
iteration calls prng() again, so in the end no particular return value
is ever more likely than any other (hence it's unbiased).
As for determinism, the end result is still fully deterministic. For
example, lets say that prng() returns the following sequence, for some
initial state:
1,0,3,0,3,7,4,7,6,6,5,3,7,7,7,0,3,6,5,2,3,3,4,0,0,2,7,4,...
then the bitmask method just returns that sequence with all the 6's
and 7's removed:
1,0,3,0,3,4,5,3,0,3,5,2,3,3,4,0,0,2,4,...
and that same sequence will always be returned, when starting from
that initial seed.
Regards,
Dean
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2021-07-04 12:00:37 | Re: Mention --enable-tap-tests in the TAP section page |
Previous Message | Andrew Dunstan | 2021-07-04 10:49:13 | Re: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members |