From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: rand48 replacement |
Date: | 2021-07-02 21:51:42 |
Message-ID: | alpine.DEB.2.22.394.2107022340250.2359988@pseudo |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Dean,
> It may be true that the bias is of the same magnitude as FP multiply,
> but it is not of the same nature. With FP multiply, the
> more-likely-to-be-chosen values are more-or-less evenly distributed
> across the range, whereas modulo concentrates them all at one end,
> making it more likely to bias test results.
Yes, that is true.
> It's worth paying attention to how other languages/libraries implement
> this, and basically no one chooses the modulo method, which ought to
> raise alarm bells. Of the biased methods, it has the worst kind of
> bias and the worst performance.
Hmmm. That is not exactly how I interpreted the figures in the blog.
> If a biased method is OK, then the biased integer multiply method
> seems to be the clear winner. This requires the high part of a
> 64x64-bit product, which is trivial if 128-bit integers are available,
> but would need a little more work otherwise. There's some code in
> common/d2s that might be suitable.
And yes, modulo is expensive. If we allow 128 bits integers operations, I
would not choose this RNPG in the first place, I'd take PCG with a 128 bits state.
That does not change the discussion about bias, though.
> Most other implementations tend to use an unbiased method though, and I
> think it's worth doing the same. It might be a bit slower, or even
> faster depending on implementation and platform, but in the context of
> the DB as a whole, I don't think a few extra cycles matters either way.
Ok ok ok, I surrender!
> The method recommended at the very end of that blog seems to be pretty
> good all round, but any of the other commonly used unbiased methods
> would probably be OK too.
That does not address my other issues with the proposed methods, in
particular the fact that the generated sequence is less deterministic, but
I think I have a simple way around that. I'm hesitating to skip to the the
bitmask method, and give up performance uniformity. I'll try to come up
with something over the week-end, and also address Tom's comments in
passing.
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Jacob Champion | 2021-07-02 22:15:45 | Re: Preventing abort() and exit() calls in libpq |
Previous Message | Paul A Jungwirth | 2021-07-02 21:39:50 | Re: SQL:2011 application time |