Re: Stats collector performance improvement

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Qingqing Zhou <zhouqq(at)cs(dot)toronto(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Stats collector performance improvement
Date:	2006-01-03 09:40:53
Message-ID:	1136281253.5052.113.camel@localhost.localdomain
Lists:	pgsql-hackers pgsql-patches pgsql-performance

On Mon, 2006-01-02 at 16:48 -0500, Tom Lane wrote:

> The two compromises that were made in the original stats design to make
> it fast were (1) stats updates lag behind reality, and (2) some updates
> may be missed entirely. Now that we have a couple of years' field
> experience with the code, it seems that (1) is acceptable for real usage
> but (2) not so much.

We decided that the stats update had to occur during execution, because
a statement that aborts would otherwise never report the row versions it
created. That means we must notify changes as they happen, yet we could
use a reliable queuing system and accept a delay before the stats become
available.

But how often do we lose a backend? Could we simply buffer that a little
better? i.e. don't send a message to stats unless we have altered at
least 10 rows? So we would buffer based upon the importance of the
message, not the actual size of the message. That way singleton
statements won't generate the same stats traffic, but we risk losing a
buffer's worth of row changes should we crash - everything would still
work if we lost a few small row change notifications.
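
To make the buffering idea concrete, here is a rough sketch in C. It is
not the pgstat code: RowStatsBuffer, note_row_changes, flush_row_stats
and ROW_FLUSH_THRESHOLD are names made up for illustration, and the
"send" only bumps counters where a real backend would emit a message to
the collector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, taken from the "at least 10 rows" suggestion. */
#define ROW_FLUSH_THRESHOLD 10

/* Hypothetical per-backend buffer of row changes not yet reported. */
typedef struct RowStatsBuffer
{
    long pending_rows;   /* row changes accumulated but not yet sent */
    long messages_sent;  /* messages sent to the collector so far */
    long rows_reported;  /* total row changes the collector has seen */
} RowStatsBuffer;

/* Stand-in for the real send: it only updates the counters. */
static void
flush_row_stats(RowStatsBuffer *buf)
{
    assert(buf != NULL);
    if (buf->pending_rows == 0)
        return;                 /* nothing to report */
    buf->rows_reported += buf->pending_rows;
    buf->messages_sent++;
    buf->pending_rows = 0;
}

/*
 * Record nrows changed rows. A message goes out only once the pending
 * count reaches the threshold. Returns true if a message was sent.
 */
static bool
note_row_changes(RowStatsBuffer *buf, long nrows)
{
    assert(buf != NULL && nrows >= 0);
    buf->pending_rows += nrows;
    if (buf->pending_rows < ROW_FLUSH_THRESHOLD)
        return false;           /* too small to be worth a message yet */
    flush_row_stats(buf);
    return true;
}
```

Whatever sits in pending_rows when a backend crashes is what gets lost,
so the threshold bounds the loss per backend. A clean backend exit would
presumably call flush_row_stats once more to push out the remainder.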

We can also save lots of cycles on the current-statement overhead, which
is currently the worst part of the stats, performance-wise. That
definitely needs redesign. AFAICS we only ever need to know the SQL
statement via the stats system if the statement has been running for
more than a few minutes - the main use case is for an admin to be able
to diagnose a rogue or hung statement. Pushing the statement to stats
every time is just a big overhead. That suggests we should have either a
pull or a deferred push (longer-than-X-secs) approach.
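
A deferred push could look roughly like the sketch below. Again this is
illustrative only: StmtTracker, stmt_begin, stmt_maybe_push, stmt_end
and STMT_PUSH_DELAY_SECS are hypothetical names, the 5 second value
merely stands in for X, and something (a timer, or a periodic check in
the executor) is assumed to call stmt_maybe_push from time to time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical value for X in "longer-than-X-secs". */
#define STMT_PUSH_DELAY_SECS 5

/* Hypothetical per-backend record of the statement now running. */
typedef struct StmtTracker
{
    const char *query;      /* statement text; pointer only, not copied */
    time_t      started_at; /* when the statement began */
    bool        pushed;     /* has the text already gone to stats? */
} StmtTracker;

/* Starting a statement costs a pointer and a timestamp, no message. */
static void
stmt_begin(StmtTracker *t, const char *query, time_t now)
{
    assert(t != NULL && query != NULL);
    t->query = query;
    t->started_at = now;
    t->pushed = false;
}

/*
 * Called periodically. Returns true only on the one call where the
 * statement has run for at least the delay and was not yet pushed;
 * that is where the statement text would be sent to the collector.
 */
static bool
stmt_maybe_push(StmtTracker *t, time_t now)
{
    assert(t != NULL);
    if (t->query == NULL || t->pushed)
        return false;
    if (difftime(now, t->started_at) < STMT_PUSH_DELAY_SECS)
        return false;       /* short statement: still nothing sent */
    t->pushed = true;
    return true;
}

/* A statement that finishes before the delay never produces a message. */
static void
stmt_end(StmtTracker *t)
{
    assert(t != NULL);
    t->query = NULL;
    t->pushed = false;
}
```

With this shape, short statements pay nothing beyond stmt_begin and
stmt_end, and only the long runners an admin would want to look at ever
reach the collector.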

Best Regards, Simon Riggs

In response to

Re: Stats collector performance improvement at 2006-01-02 21:48:45 from Tom Lane

Responses

Re: Stats collector performance improvement at 2006-01-03 16:35:56 from Jim C. Nasby
Re: Stats collector performance improvement at 2006-01-03 21:42:53 from Hannu Krosing
