From: | Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com> |
---|---|
To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: gaussian distribution pgbench -- splits v4 |
Date: | 2014-08-01 07:58:01 |
Message-ID: | CADupcHWoU=L+1zPY+WfxXH=VoofeZkMGVhDJpWD4nWiW2H8oSQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
2014-08-01 16:26 GMT+09:00 Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
>> Maybe somebody who knows more math than I do (like you, probably!) can
>> come up with something more clever.
>
> I can certainly suggest other formula, but that does not mean beautiful
> code, thus would probably be rejected. I'll see.
>
> An alternative to this whole process may be to hash/modulo a non uniform
> random value.
>
> id = 1 + hash(some-random()) % n
>
> But the hashing changes the distribution as it adds collisions, so I have
> to think about how to be able to control the distribution in that case, and
> what hash function to use.
I think we have to choose a reproducible method, because a benchmark
always needs robust and reproducible results. And if we implement this
idea, we might need a more accurate random generator, such as the
Mersenne Twister algorithm. The erand48 algorithm is slow and not very
accurate.
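To make the reproducibility point concrete, here is a rough sketch of the
hash/modulo idea quoted above. It is only an illustration in Python, not
pgbench code: the names fnv1a_64 and hashed_ids are hypothetical, and the
choice of an exponential draw and of FNV-1a as the hash are my assumptions,
not something decided in this thread. The one thing it is meant to show is
that a fixed seed plus a fixed hash function gives the same id sequence on
every run.

```python
import random
import struct

FNV_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # standard 64-bit FNV prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data):
    """Deterministic 64-bit FNV-1a hash (assumed hash choice, for illustration)."""
    h = FNV_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


def hashed_ids(seed, n, count, rate=1.0):
    """Return `count` ids in 1..n drawn from a seeded, non-uniform source.

    Hypothetical helper: the exponential distribution is only a stand-in
    for "some-random()" in the quoted formula.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    rng = random.Random(seed)  # CPython's generator is a Mersenne Twister
    ids = []
    for _ in range(count):
        # Non-uniform draw, scaled and truncated to an unsigned 64-bit integer.
        raw = int(rng.expovariate(rate) * n) & MASK64
        # id = 1 + hash(some-random()) % n, with a fixed hash function.
        ids.append(1 + fnv1a_64(struct.pack("<Q", raw)) % n)
    return ids
```

Calling hashed_ids(42, 1000, 5) twice returns the same list both times, and
a different seed gives a different list. The collisions mentioned in the
quoted text are still there: distinct raw values can map to the same id, so
the resulting distribution is not exactly the one that was drawn from.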
By the way, I don't know how this topic relates to the command-line
option... Well, whatever...
Regards,
--
Mitsumasa KONDO