Re: Performance monitor signal handler

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL HACKERS <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance monitor signal handler
Date: 2001-03-19 18:04:37
Message-ID: 200103191804.NAA20163@jupiter.jw.home
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Bruce Momjian wrote:
> > Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > > Only shared memory gives us near-zero cost for write/read. 99% of
> > > backends will not be using stats, so it has to be cheap.
> >
> > Not with a circular buffer it's not cheap, because you need interlocking
> > on writes. Your claim that you can get away without that is simply
> > false. You won't just get lost messages, you'll get corrupted messages.
>
> How do I get corrupt messages if they are all five bytes? If I write
> five bytes, and another does the same, I guess the assembler could
> intersperse the writes so the oid gets to be a corrupt value. Any cheap
> way around this, perhaps by skiping/clearing the write on a collision?
>
> >
> > > The collector program can read the shared memory stats and keep hashed
> > > values of accumulated stats. It uses the "Loops" variable to know if it
> > > has read the current information in the buffer.
> >
> > And how does it sleep until the counter has been advanced? Seems to me
> > it has to busy-wait (bad) or sleep (worse; if the minimum sleep delay
> > is 10 ms then it's guaranteed to miss a lot of data under load).
>
> I figured it could just wake up every few seconds and check. It will
> remember the loop counter and current pointer, and read any new
> information. I was thinking of a 20k buffer, which could cover about 4k
> events.

Here I wonder what your EVENT is. With an Oid as identifier
and a 1 byte (even if it'd be anoter 32-bit value), how many
messages do you want to generate to get these statistics:

- Number of sequential scans done per table.
- Number of tuples returned via sequential scans per table.
- Number of buffer cache lookups done through sequential
scans per table.
- Number of buffer cache hits for sequential scans per
table.
- Number of tuples inserted per table.
- Number of tuples updated per table.
- Number of tuples deleted per table.
- Number of index scans done per index.
- Number of index tuples returned per index.
- Number of buffer cache lookups done due to scans per
index.
- Number of buffer cache hits per index.
- Number of valid heap tuples returned via index scan per
index.
- Number of buffer cache lookups done for heap fetches via
index scan per index.
- Number of buffer cache hits for heap fetches via index
scan per index.
- Number of buffer cache lookups not accountable for any of
the above.
- Number of buffer cache hits not accountable for any of
the above.

What I see is that there's a difference in what we two want
to see in the statistics. You're talking about looking at the
actual querystring and such. That's information useful for
someone actually looking at a server, to see what a
particular backend is doing. On my notebook a parallel
regression test (containing >4,000 queries) passes by under
1:30, that's more than 40 queries per second. So that doesn't
tell me much.

What I'm after is to collect the above data over a week or so
and then generate a report to identify the hot spots of the
schema. Which tables/indices cause the most disk I/O, what's
the average percentage of tuples returned in scans (not from
the query, I mean from the single scan inside of the joins).
That's the information I need to know where to look for
possibly better qualifications, useless indices that aren't
worth to maintain and the like.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2001-03-19 18:10:16 Re: Performance monitor signal handler
Previous Message Tom Lane 2001-03-19 17:27:50 Re: AW: AW: Re[4]: Allowing WAL fsync to be done via O_SYNC