Re: CPU costs of random_zipfian in pgbench

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CPU costs of random_zipfian in pgbench
Date: 2019-02-17 22:41:10
Message-ID: CAH2-WznOikUWYO7PKO0m0=YC39iXV5mFEaFx65Oma-DUtQW3rA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 17, 2019 at 8:09 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> TBH, I think pgbench is now much too complex; it does not need more
> features, especially not ones that need large caveats in the docs.
> (What exactly is the point of having zipfian at all?)

I agree that pgbench is too complex, given its mandate and design.
While I found Zipfian useful once or twice, I probably would have done
just as well with an exponential distribution.

I have been using BenchmarkSQL as a fair-use TPC-C implementation for
my indexing project, with great results. pgbench just isn't very
useful when validating the changes to B-Tree page splits that I
propose, because the insertion pattern cannot be modeled
probabilistically. Besides, I really think that things like latency
graphs are table stakes for this kind of work, and BenchmarkSQL
offers them out of the box. It isn't practical to make pgbench into a
framework, which is what I'd really like to see. There just isn't that
much more that can be done there.

BenchmarkSQL seems to have recently become abandonware, though it's
abandonware that I rely on. OpenSCG were acquired by Amazon, and the
Bitbucket repository vanished without explanation. I would be very
pleased if Fabien or somebody else considered maintaining it going
forward -- it still has a lot of rough edges, and still could stand to
be improved in a number of ways (I know that Fabien is interested in
both indexing and benchmarking, which is why I thought of him). TPC-C
is a highly studied benchmark, and I'm sure that there is more we can
learn from it. I maintain a mirror of BenchmarkSQL here:

https://github.com/petergeoghegan/benchmarksql/

There are at least 2 or 3 other fair-use implementations of TPC-C that
work with Postgres that I'm aware of, all of which seem to have
several major problems. BenchmarkSQL is a solid basis for an
externally maintained, de facto standard Postgres benchmarking tool
that comes with "batteries included".

--
Peter Geoghegan
