From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alik Khilazhev <a(dot)khilazhev(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [WIP] Zipfian distribution in pgbench |
Date: | 2017-07-07 18:19:44 |
Message-ID: | CAH2-WznMsUdEvQbpHjox+Gqjz8bn7DgxkEjUGp6iL-0O6UTzGg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Jul 7, 2017 at 5:17 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> How is that possible? In a Zipfian distribution, no matter how big
> the table is, almost all of the updates will be concentrated on a
> handful of rows - and updates to any given row are necessarily
> serialized, or so I would think. Maybe MongoDB can be fast there
> since there are no transactions, so it can just lock the row, slam in
> the new value, and unlock the row, all (I suppose) without writing WAL
> or doing anything hard.
If you're not using the WiredTiger storage engine, then the locking
is at the document level, which means that a Zipfian distribution is
no worse than any other as far as lock contention goes. That's one
possible explanation. Another is that index-organized tables
naturally have much better locality, which matters at every level of
the memory hierarchy.
> I'm more curious about why we're performing badly than I am about a
> general-purpose random_zipfian function. :-)
I'm interested in both. I think that a random_zipfian function would
be quite helpful for modeling certain kinds of performance problems,
like CPU cache misses incurred at the page level.
--
Peter Geoghegan