From: | KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: gaussian distribution pgbench |
Date: | 2014-03-17 05:39:44 |
Message-ID: | 53268AA0.2040909@lab.ntt.co.jp |
Lists: | pgsql-hackers |
(2014/03/15 15:53), Fabien COELHO wrote:
>
> Hello Heikki,
>
>> A couple of comments:
>>
>> * There should be an explicit "\setrandom ... uniform" option too, even though
>> you get that implicitly if you don't specify the distribution
>
> Indeed. I agree. I suggested it, but it got lost.
>
>> * What exactly does the "threshold" mean? The docs informally explain that "the
>> larger the thresold, the more frequent values close to the middle of the
>> interval are drawn", but that's pretty vague.
>
> There are explanations and computations as comments in the code. If it is about
> the documentation, I'm not sure that a very precise mathematical definition will
> help a lot of people, and might rather hinder understanding, so the doc focuses
> on an intuitive explanation instead.
>
>> * Does min and max really make sense for gaussian and exponential
>> distributions? For gaussian, I would expect mean and standard deviation as the
>> parameters, not min/max/threshold.
>
> Yes... and no:-) The aim is to draw an integer primary key from a table, so it
> must be in a specified range. This is approximated by drawing a double value with
> the expected distribution (gaussian or exponential) and project it carefully onto
> integers. If it is out of range, there is a loop and another value is drawn. The
> minimal threshold constraint (2.0) ensures that the probability of looping is low.
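
To make the draw-project-redraw idea above concrete, here is a rough sketch in Python. It is only an illustration under assumptions, not the patch's C code: the function name `gaussian_int`, its parameters, and the exact mapping from the threshold onto the range are hypothetical, and only the gaussian case is shown.

```python
import math
import random


def gaussian_int(rng, lo, hi, threshold):
    """Draw an integer in [lo, hi] from a truncated gaussian (illustrative sketch).

    Hypothetical helper: the name, signature and mapping are assumptions,
    not taken from the pgbench patch.
    """
    if threshold < 2.0:
        # The minimal threshold keeps the probability of redrawing low.
        raise ValueError("threshold must be at least 2.0")
    while True:
        # Draw a standard normal deviate as a double.
        z = rng.gauss(0.0, 1.0)
        # Out of range: loop and draw another value.
        if -threshold <= z < threshold:
            break
    # Project [-threshold, threshold) onto [0, 1), then onto lo..hi inclusive.
    u = (z + threshold) / (2.0 * threshold)
    # Clamp to guard against floating-point rounding at the upper edge.
    return min(hi, lo + int(u * (hi - lo + 1)))


if __name__ == "__main__":
    rng = random.Random(42)
    print([gaussian_int(rng, 1, 100, 5.0) for _ in range(10)])
```

With this mapping, a larger threshold squeezes more of the bell curve into the range, so values near the middle of `lo..hi` are drawn more often, and the redraw loop triggers less often.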
>
>> * How about setting the variable as a float instead of integer? Would seem more
>> natural to me. At least as an option.
>
> Which variable? The values set by setrandom are mostly used for primary keys. We
> really want integers in a range.
Oh, I see. He was talking about the documentation.
+ Moreover, set gaussian or exponential with threshold interger value,
+ we can get gaussian or exponential random in integer value between
+ <replaceable>min</> and <replaceable>max</> bounds inclusive.
Correctly, it should be:
+ Moreover, set gaussian or exponential with threshold double value,
+ we can get gaussian or exponential random in integer value between
+ <replaceable>min</> and <replaceable>max</> bounds inclusive.
I am also going to revise the documentation to make it easier for users to understand.
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center