From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add min and max execute statement time in pg_stat_statement
Date: 2014-01-30 19:28:02
Message-ID: CAM3SWZSkom7mGbL4XE_pm+iguNB162M3jdfe8VyGXtWkvcrzCA@mail.gmail.com
Lists: pgsql-hackers
On Thu, Jan 30, 2014 at 9:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> BTW ... it occurs to me to wonder if it'd be feasible to keep the
> query-texts file mmap'd in each backend, thereby reducing the overhead
> to write a new text to about the cost of a memcpy, and eliminating the
> read cost in pg_stat_statements() altogether. It's most likely not worth
> the trouble; but if a more-realistic benchmark test shows that we actually
> have a performance issue there, that might be a way out without giving up
> the functional advantages of Peter's patch.
There could be a worst case for that scheme too, plus we'd have to
figure out how to make it work with Windows, which in the case of
mmap() is not a sunk cost AFAIK. I'm skeptical of the benefit of
pursuing that.
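To spell out what's being proposed, the scheme would amount to something
like the sketch below. This isn't code from my patch or from anything Tom
has written; the names (QueryTextMap, qtext_open, qtext_append) are made
up, and a real implementation would need locking, remapping as the file
grows, and some Windows substitute for mmap():

    /*
     * Minimal sketch (not PostgreSQL code) of the idea: keep the external
     * query-texts file mmap'd, so storing a new text is roughly the cost
     * of a memcpy and readers never have to re-read the file from disk.
     */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    typedef struct QueryTextMap
    {
        int     fd;         /* descriptor for the query-texts file */
        char   *base;       /* start of the mapping */
        size_t  mapped;     /* mapped size */
        size_t  used;       /* bytes already written */
    } QueryTextMap;

    int
    qtext_open(QueryTextMap *qt, const char *path, size_t size)
    {
        qt->fd = open(path, O_RDWR | O_CREAT, 0600);
        if (qt->fd < 0)
            return -1;
        if (ftruncate(qt->fd, (off_t) size) != 0)
            return -1;
        qt->base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, qt->fd, 0);
        if (qt->base == MAP_FAILED)
            return -1;
        qt->mapped = size;
        qt->used = 0;
        return 0;
    }

    /* Append a null-terminated query text; return its offset, or -1. */
    long
    qtext_append(QueryTextMap *qt, const char *text)
    {
        size_t  len = strlen(text) + 1;
        long    off = (long) qt->used;

        if (qt->used + len > qt->mapped)
            return -1;      /* real code would grow the file and remap */
        memcpy(qt->base + qt->used, text, len);
        qt->used += len;
        return off;
    }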
I'm totally unimpressed with the benchmark as things stand. It relies
on keeping 64 clients in perfect lockstep as each executes 10,000
queries that are each unique snowflakes. Yet even though they're
unique snowflakes, and even though there are 10,000 of them, every
client executes the same one at exactly the same time relative to the
others, in exactly the same order, as quickly as possible. Even so,
the headline "reference score" of -35% is completely misleading,
because it isn't comparing like with like in terms of hash table size.
This benchmark incidentally recommends that we reduce the default hash
table size to improve performance when the hash table is under
pressure, which is ludicrous. It's completely backwards. You could
also use the benchmark to demonstrate that the overhead of calling
pg_stat_statements() is ridiculously high, since, like creating a new
query text, it only requires a shared lock.
This is an implausibly bad worst case for larger hash table sizes in
pg_stat_statements generally. 5,000 entries is enough for the large
majority of applications. But for those that do hit that limit, in
practice they'll still find the vast majority of queries already in
the table as they're executed. If they don't, they can double or
triple their "max" setting (pg_stat_statements.max), because the
shared memory overhead is so low. No one incurs any additional
overhead once their query is already in the hash table. In reality,
actual applications could hardly be further from the perfectly uniform
distribution of distinct queries presented here.
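For anyone who hasn't looked at the code, the pattern at issue is roughly
the sketch below. It's a simplification, not the actual pg_stat_statements
code (the real thing keys on more than the query id, protects the counters
with a per-entry spinlock, and evicts entries when it hits the limit); the
Entry struct and record_execution() are invented for illustration:

    /* Simplified sketch of the shared hash table protocol */
    #include "postgres.h"
    #include "storage/lwlock.h"
    #include "utils/hsearch.h"

    typedef struct Entry
    {
        uint64  queryid;        /* hash key */
        int64   calls;
        double  total_time;
    } Entry;

    void
    record_execution(HTAB *hash, LWLock *lock, uint64 queryid,
                     double time_ms)
    {
        Entry  *entry;
        bool    found;

        /* Common case: the query is already present; a shared lock suffices */
        LWLockAcquire(lock, LW_SHARED);
        entry = (Entry *) hash_search(hash, &queryid, HASH_FIND, NULL);
        if (entry == NULL)
        {
            /*
             * Miss: retake the lock exclusively to create the entry.  Only
             * queries not yet in the table ever pay this cost, which is why
             * a workload of nothing but never-repeated queries is the worst
             * case.
             */
            LWLockRelease(lock);
            LWLockAcquire(lock, LW_EXCLUSIVE);
            entry = (Entry *) hash_search(hash, &queryid, HASH_ENTER, &found);
            if (!found)
            {
                entry->calls = 0;
                entry->total_time = 0;
            }
        }
        entry->calls++;         /* real code uses a per-entry spinlock here */
        entry->total_time += time_ms;
        LWLockRelease(lock);
    }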
--
Peter Geoghegan