From: | KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Peter Geoghegan <pg(at)heroku(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Subject: | Re: gaussian distribution pgbench |
Date: | 2014-01-14 06:52:17 |
Message-ID: | 52D4DEA1.3050900@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I revise my gaussian pgbench patch which wss requested from community.
* Changes
- Support custom script.
- "\setgaussian" is generating gaussian distribute random number.
- ex) \setgaussian [var] [min] [max] [stddev_threshold]
- We can use mixture model in multiple custom scripts.
- Delete short option "-g", and add long options ”--gaussian"
- Refactoring getrand() interface
> - getrand(TState *thread, int64 min, int64 max)
> + getrand(TState *thread, int64 min, int64 max, DistType dist_type, double value1)
- We can easy to add other random distribution algorithms. Please see detail
design in attached patch.
Febien COELHO wrote:
>> From a probabilistic point of view, it seems to me that a randomized
> (discretized) exponential would be more significant to model a server load.
>
> \setexp var min max lambda...
I can create randomized exponential distribution under following. It is very easy.
double rand_exp( double lambda ){
return -log(Uniform(0,1))/lambda;
}
If community wants this, I will add this function in my patch.
Gavin Flower wrote:
> Curious, wouldn't the common usage pattern tend to favour a skewed distribution,
> such as the Poisson Distribution (it has been over 40 years since I studied
> this area, so there may be better candidates).
The difference between Poisson distribution and Gaussian distribution is discrete
or not.
In my gaussian algorithm, first generating continuos gaussian distribution, next
projection to integer values which are each record, it will be discrete value.
Therefore, it will be almost simular with Poisson distribution. And when we set
larger standard deviations(higher 10), it will be created better approximation of
Poisson distribution.
Attached sql files are for custom scripts which are different distribution. It
realize mixture distribuion benchmark. And attached graph is the result.
[example command]
$pgbench -f file1.sql file2.sql
If you have more some comment, please send me.
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
gaussian_pgbench_v3.patch | text/x-diff | 16.9 KB |
file1.sql | text/plain | 347 bytes |
file2.sql | text/plain | 575 bytes |
image/png | 4.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jeevan Chalke | 2014-01-14 06:52:30 | Re: [PATCH] Filter error log statements by sqlstate |
Previous Message | Amit Kapila | 2014-01-14 06:16:11 | Re: Performance Improvement by reducing WAL for Update Operation |