Re: Improvement of pg_stat_statement usage about buffer hit ratio

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improvement of pg_stat_statement usage about buffer hit ratio
Date: 2013-11-19 03:03:00
Message-ID: CAM3SWZQpgFaCdz2+1kaRfOUP_jDwAMYX4JWFJjaE6B05xp9Kgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 18, 2013 at 6:12 PM, KONDO Mitsumasa
<kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I confirmed that Itagaki-san and Mr Cerdic disscution. He said that raw
> values be just simple. However, were his changes just simple? I cannot
> understand his aesthetics sense and also you, too:-(

> It's too complicated, and do you know how to tuning PG from information of
> local_* and temp_*?
> At least, I think that most user cannot tuning from these information, and
> it might not be useful information only part of them.

All of those costs are cumulative aggregates. If we didn't aggregate
them, then the user couldn't possibly determine them on their own, to
any approximation. That's the difference. If you think the local_* and
temp_* aren't very useful, I'm inclined to agree, but it's too late to
do anything about that now.

> No. It's not for geek tools and people having pre-packaged solution in big
> company, but also for common DBA tools.

I don't think that the tool needs to be expensive. If selecting from
the pg_stat_statements view every 1-3 seconds is too expensive for
such a tool, we can have a discussion about being smarter, because
there certainly are ways to optimize it.

Regarding your min/max patch: I'm opposed to adding even more to the
spinlock-protected counters struct, so that we can get an exact answer
to a question where an approximate answer would very likely be good
enough. And as Itagaki-san said 4 years ago, who is to say that what
you've done here for buffers (or equally, what you've done in your
min/max patch) is more interesting than the same thing but for another
cost? The point of having what you've removed from the
pg_stat_statements docs about calculating averages is that it is an
example that can be generalized from. I certainly think there should
be better tooling to make displaying costs over time easier, or
characterizing the distribution, but unfortunately this isn't it.

Something like pg_stat_statements is allowed to be approximate. That's
considered an appropriate trade-off. Most obviously, today there can
be hash table collisions, and some of the entries can therefore be
plain wrong. Previously, I put the probability of 1 collision in the
hash table at about 1% when pg_stat_statements.max is set to 10,000.
So if your min/max patch was "implemented in userspace", and an
outlier is lost in the noise with just one second of activity, I'm not
terribly concerned about that. It's a trade-off, and if you don't
think it's the correct one, well then I'm afraid that's just where you
and I differ. As I've said many times, if you want to have a
discussion about making aggressive snapshotting of the
pg_stat_statements view more practical, I think that would be very
useful.

> By the way, MySQL and Oracle database which are very popular have these
> statistics. I think that your argument might disturb people who wants to
> migration from these database and will accelerate popularity of these
> database more.

I think that there should be better tooling built on top of
pg_stat_statements. I don't know what Oracle does, but I'm pretty sure
that MySQL has nothing like pg_stat_statements. Please correct me if
I'm mistaken.

>> As I said on the min/max thread, if we're not
>> doing enough to help people who would like to build such a tool, we
>> should discuss how we can do better.
>
> Could you tell me how to get min/max statistics with low cost?

See my previous comments on the other thread about making
pg_stat_statements only return changed entries, and only sending the
query text once.

> I'm not sure about detail of your patch in CF, but it seems very high cost.

I think you should actually go and read the code and read my
explanation of it, and refrain from making uninformed remarks like
this. Whatever overhead my patch may add, the important point is that
it obviously and self-evidently adds exactly zero overhead to
maintaining statistics for existing entries - we only care about the
query text when first creating an entry, or when viewing an entry when
the view is selected from. With the reduced shared memory consumption,
more entries can be created, making the cost of creating new entries
(and thus whatever might have been added to that cost) matter less.
Having said that, the additional cost is thought to be very low
anyway.

If you read my mail to the list about this, you'd know that I
specifically worried about the implications of what I'd proposed for
tools like your tool. That's part of the reason I keep urging you to
think about making pg_stat_statements cheaper for consumer tools.
There is no reason to send query texts to such tools more than once
per entry, and no reason to send unchanged entries.

> Repeatedly, I think that if we want to get drastic detail statistics, we
> have to create another tools of statistics. Your patch will be these
> statistics tools. However, pg_stat_statement sholud be just light weight.

This is incomprehensible. As with the cumulative aggregate costs, how
is a consumer of pg_stat_statements possibly going to get the full
query text from anywhere else? It *has* to come from
pg_stat_statements directly.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2013-11-19 03:13:19 Re: pg_upgrade: delete_old_cluster.sh issues
Previous Message KONDO Mitsumasa 2013-11-19 02:39:51 Re: Improvement of pg_stat_statement usage about buffer hit ratio