From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pgbench - add pseudo-random permutation function |
Date: | 2021-03-11 16:31:44 |
Message-ID: | CAEZATCWa1P7hVwJCxCDEDOFvVERO2J51XBz_6gbsozR_ton1fw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 11 Mar 2021 at 00:58, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> Maybe Dean Rasheed can help because of his math background --- CC'ing him.
>
Reading the thread I can see how such a function might be useful to
scatter non-uniformly random values.
The implementation looks plausible too, though it adds quite a large
amount of new code. The main thing that concerns me is justifying the
code. With this kind of thing, it's all too easy to overlook corner
cases and end up with trivial sequences in certain special cases. I'd
feel better about that if we were implementing a standard algorithm
with known pedigree.
Thinking about the use case for this, it seems that it's basically
designed to turn a set of non-uniform random numbers (produced by
random_exponential() et al.) into another set of non-uniform random
numbers, where the non-uniformity is scattered so that the more/less
common values aren't all clumped together.
I'm wondering if that's something that can't be done more simply by
passing the non-uniform random numbers through the uniform random
number generator to scatter them uniformly across some range -- e.g.,
given an integer n, return the n'th value from the sequence produced
by random(), starting from some initial seed -- i.e., implement
nth_random(lb, ub, seed, n). That would actually be pretty
straightforward to implement using O(log(n)) time to execute (see the
attached python example), though it wouldn't generate a permutation,
so it'd need a bit of thought to see if it met the requirements.
Regards,
Dean
Attachment | Content-Type | Size |
---|---|---|
erand48.py | text/x-python | 926 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2021-03-11 17:01:56 | Re: Huge memory consumption on partitioned table with FKs |
Previous Message | Robert Haas | 2021-03-11 16:31:32 | Re: New IndexAM API controlling index vacuum strategies |