Re: Increasing the length of pg_stat_activity.current_query...

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Increasing the length of pg_stat_activity.current_query...
Date: 2004-11-11 04:55:20
Message-ID: 87zn1pf0hz.fsf@stark.xeocode.com
Lists: pgsql-hackers


Simon Riggs <simon(at)2ndquadrant(dot)com> writes:

> My intention was towards a data warehouse situation, and my comments are
> only relevant in that context. Possibly 25+% of the user base use this
> style of processing. In that case, I expect queries to run for minutes
> or hours.

I come from the opposite angle but have also ended up with the same
conclusion.

In an OLTP environment you can't afford to log every single SQL query. And
saving only the queries that take longer than some arbitrary threshold may not
capture enough to give a good picture of what's going on.
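The threshold style of logging I mean is presumably log_min_duration_statement
or something like it, for example (normally set in postgresql.conf, shown via
SET only for illustration):

    -- log only statements that take longer than 500 ms
    SET log_min_duration_statement = 500;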

I like the idea of a stats daemon that's isolated from the server by something
like UDP and keeps statistics. It would let me turn off logging while still
being able to peek into what queries are running, which take the longest,
which are being executed the most often, and which are taking the most
cumulative time (which isn't necessarily the same thing as either of the other
two).
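Something like the following is the kind of interface I have in mind on top of
such a daemon. The view and columns are purely hypothetical, nothing like
pg_stat_queries exists today; it is just a sketch of the three orderings above:

    -- hypothetical view fed by the stats daemon (names made up for illustration)
    SELECT query, calls, total_time
      FROM pg_stat_queries
     ORDER BY total_time DESC            -- most cumulative time
     LIMIT 10;
    -- ORDER BY calls DESC               gives the most frequently executed,
    -- ORDER BY total_time / calls DESC  the individually slowest queries.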

The idea of tracking cache misses is great, though in the current design a
Postgres buffer cache miss doesn't necessarily mean a real cache miss, since
the block may still be sitting in the OS cache. If Postgres moves to O_DIRECT
it would become a valuable statistic, as would instrumentation that times
reads and can distinguish OS cache hits from genuine misses.
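What we do already track at the buffer level shows up in the statio views,
with the caveat above that a block "read" there only means it missed
shared_buffers, not that it actually went to disk:

    -- per-table buffer hit ratio from the existing statistics views
    SELECT relname, heap_blks_hit, heap_blks_read,
           round(heap_blks_hit::numeric
                 / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
      FROM pg_statio_user_tables
     ORDER BY heap_blks_read DESC
     LIMIT 10;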

I can say that with Oracle it was *incredibly* useful to have the currently
executing and recently cached queries queryable straight out of the statement
cache. The ora_explain tool that
comes with DBD::Oracle makes it extremely easy to identify queries consuming
resources, experiment with rewrites, and then copy the new query into the
application.

It would be great to have something equivalent for Postgres. It would be
extremely kludgy by comparison to have to dig through the log files for
queries. Much better would be to have an interface to access the data pgstats
gathers. But that only works if the entire query is there.
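For what it's worth, the interface we have today is pg_stat_activity, and
current_query is exactly the column this thread is about truncating less
aggressively; with stats_command_string enabled, something like:

    -- what is running right now (current_query is truncated today)
    SELECT procpid, usename, current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>';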

--
greg
