| From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
|---|---|
| To: | Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: CPU costs of random_zipfian in pgbench |
| Date: | 2019-02-22 12:03:13 |
| Message-ID: | 1d1f5a64-82c1-5b89-dbb7-3d621198f01d@2ndquadrant.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 2/22/19 11:22 AM, Ants Aasma wrote:
> On Sun, Feb 17, 2019 at 10:52 AM Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr
> <mailto:coelho(at)cri(dot)ensmp(dot)fr>> wrote:
>
> > I'm trying to use random_zipfian() for benchmarking of skewed data
> sets,
> > and I ran head-first into an issue with rather excessive CPU costs.
> > [...] This happens because generalizedHarmonicNumber() does this:
> >
> > for (i = n; i > 1; i--)
> > ans += pow(i, -s);
> >
> > where n happens to be 1000000000 (range passed to random_zipfian), so
> > the loop takes quite a bit of time.
>
> If you find a better formula for the harmonic number, you are welcome
> and probably get your name on it:-)
>
>
> There are pretty good approximations for s > 1.0 using Riemann zeta
> function and Euler derived a formula for the s = 1 case.
>
I believe that's what random_zipfian() already uses, because for s > 1.0
it refers to "Non-Uniform Random Variate Generation" by Luc Devroye, and
the text references the zeta function. Also, I have not observed serious
issues with the s > 1.0 case (despite the docs seem to suggest there may
be some).
> I also noticed that i is int in this function, but n is int64. That
> seems like an oversight.
>
Indeed.
> Regards,
> Ants Aasma
>
>
cheers
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Pavel Stehule | 2019-02-22 12:42:37 | Re: proposal: variadic argument support for least, greatest function |
| Previous Message | Peter Eisentraut | 2019-02-22 11:38:35 | Re: unconstify equivalent for volatile |