From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: rand48 replacement |
Date: | 2021-07-08 08:26:23 |
Message-ID: | alpine.DEB.2.22.394.2107072052270.4160419@pseudo |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Dean,
> Whilst it has been interesting learning and discussing all these
> different techniques, I think it's probably best to stick with the
> bitmask method, rather than making the code too complex and difficult
> to follow.
Yes.
> The bitmask method has the advantage of being very simple, easy to
> understand and fast (fastest in many of the benchmarks, and close enough
> in others to make me think that the difference won't matter for our
> purposes).
>
> To test the current patch, I hacked up a simple SQL-callable server
> function: random(bigint, bigint) returns bigint, similar to the one in
> pgbench. After doing so, I couldn't help thinking that it would be
> useful to have such a function in core, so maybe that could be a
> follow-on patch.
Yep.
> Anyway, that led to the following observations:
>
> Firstly, there's a bug in the existing mask_u64() code -- if
> pg_leftmost_one_pos64(u) returns 63, you end up with a mask equal to
> 0, and it breaks down.
Oops:-(
> Secondly, I think it would be simpler to implement this as a bitshift,
> rather than a bitmask, using the high bits from the random number. That
> might not make much difference for xoroshiro**, but in general, PRNGs
> tend to be weaker in the lower bits, so it seems preferable on that
> basis. But also, it makes the code simpler and less error-prone.
Indeed, that looks like a good option.
> Finally, I think it would be better to treat the upper bound of the
> range as inclusive.
This bothered me as well, but the usual approach seems to use range as the
number of values, so I was hesitant to depart from that. I'm still
hesitant to go that way.
> Doing so makes the function able to cover all
> possible 64-bit ranges. It would then be easy (perhaps in another
> follow-on patch) to make the pgbench random() function work for all
> 64-bit bounds (as long as max >= min), without the weird overflow
> checking it currently has.
>
> Putting those 3 things together, the code (minus comments) becomes:
>
> if (range > 0)
> {
> int rshift = 63 - pg_leftmost_one_pos64(range);
>
> do
> {
> val = xoroshiro128ss(state) >> rshift;
> }
> while (val > range);
> }
> else
> val = 0;
>
> which reduces the complexity a bit.
Indeed.
Attached v9 follows this approach but for the range being inclusive, as
most sources I found understand the range as the number of values.
--
Fabien.
Attachment | Content-Type | Size |
---|---|---|
prng-9.patch | text/x-diff | 47.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2021-07-08 08:30:23 | Re: Incorrect usage of strtol, atoi for non-numeric junk inputs |
Previous Message | Dean Rasheed | 2021-07-08 08:23:03 | Re: [PATCH] expand the units that pg_size_pretty supports on output |