From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: CPU costs of random_zipfian in pgbench |
Date: | 2019-02-17 17:33:45 |
Message-ID: | 20190217173344.GY10435@fetter.org |
Lists: | pgsql-hackers |
On Sun, Feb 17, 2019 at 11:09:27AM -0500, Tom Lane wrote:
> Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
> >> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
> >> and I ran head-first into an issue with rather excessive CPU costs.
>
> > If you want skewed but not especially zipfian, use exponential which is
> > quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
> > the harmonic number, so it depends on the parameter.
>
> Maybe we should drop support for parameter values < 1.0, then. The idea
> that pgbench is doing something so expensive as to require caching seems
> flat-out insane from here. That cannot be seen as anything but a foot-gun
> for unwary users. Under what circumstances would an informed user use
> that random distribution rather than another far-cheaper-to-compute one?
>
> > ... This is why I submitted a pseudo-random permutation
> > function, which alas does not get much momentum from committers.
>
> TBH, I think pgbench is now much too complex; it does not need more
> features, especially not ones that need large caveats in the docs.
> (What exactly is the point of having zipfian at all?)
From a statistical perspective, Zipfian distributions violate some of
the assumptions we implicitly make when we benchmark with uniform
distributions. This matters because Zipf-distributed data sets are
quite common in real life.
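To make the difference concrete, here is a rough sketch in plain Python (not pgbench code) comparing how much of the access probability lands on the hottest 1% of keys under a uniform distribution versus a Zipfian one. The key count, the exponent 1.1, and the helper names are arbitrary choices for illustration only.

```python
def zipf_weights(n, s):
    """Unnormalized Zipf weights: the key of rank k gets weight 1 / k**s."""
    return [1.0 / k**s for k in range(1, n + 1)]

def top_share(weights, fraction):
    """Share of total probability mass held by the first `fraction` of ranks."""
    top = max(1, int(len(weights) * fraction))
    return sum(weights[:top]) / sum(weights)

n = 100_000                    # number of distinct keys (assumed)
uniform = [1.0] * n            # every key equally likely
zipf = zipf_weights(n, 1.1)    # exponent 1.1 is an arbitrary example value

uniform_share = top_share(uniform, 0.01)
zipf_share = top_share(zipf, 0.01)

print(f"uniform: hottest 1% of keys receive {uniform_share:.1%} of accesses")
print(f"zipfian: hottest 1% of keys receive {zipf_share:.1%} of accesses")
```

Under the uniform case the hottest 1% of keys get 1% of the accesses by construction, while under the Zipfian weights they get the majority, which is the kind of hot-spot behavior a uniform benchmark never exercises.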
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate