Re: Postmaster processes running out of control?

From: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Postmaster processes running out of control?
Date: 2002-03-22 14:09:16
Message-ID: Pine.LNX.4.21.0203221346350.6141-100000@ponder.fairway2k.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Fri, 22 Mar 2002, Martijn van Oosterhout wrote:

> On Fri, Mar 22, 2002 at 01:11:11AM +0000, Nigel J. Andrews wrote:
> > Good point, I hadn't tried it since the upgrade becuase that wasn't why I
> > upgraded (don't worry I've got a _long_ post on that subject waiting to be
> > sent), ...
> > [sniped]
> > ...
> > the EXPLAIN output because it's not interesting and it would almost be
> > an overlap with the contents of my long, pending post.
>
> Hmm, with something that matches most of the table, it would likely choose a
> sequential scan which can take quite a while over a large table

Yep, doing a seqscan does take a while and I can see why it does one most of
the time. This is the subject of my long, pending post, which is no longer
pending as I have recently sent it to the list.

> >[sniped]
> >
> If you're selecting on one column (poster_name) and sorting on another
> (time) it may help to have an index on both (time,poster_name) since that
> avoids the sort step. (I hope 7.2 estimates sort costs better than earlier
> versions).

Thank you, it didn't occur to me that the two column index would ease the
sorting. Would the order of the columns specified in the index creation be
significant?

> However, if you really want the whole output (rather than say the first 100
> lines) and that really involves trawling a majority of the table, then you
> are simply bound by disk transfer speed.

I have coded the application to use all the data, caching the pages (it's a web
site) not requested but which use the results of the query, eg. pages 2 to xxx
of the results when page 1 has been requested by the user. I'm happy this way
since the query hit is taken only once instead of several dozen times and I
have a fancy caching system that fits in nicely with data update frequency. In
fact, that was one of the driving forces behind my fancy caching system. The
I/O is a definitely a bottleneck. There's almost continuous disk activity with
a change in it's 'character' marking the switch from backend query processing
to page cache filling.

>
> Alternativly, using a cursor may allow you to start outputting data before
> the query has finished.

At the moment the application just issues the query and waits for the results.
I had considered changing this more in order to reduce the memory usage within
the client than as a speeding up means. I think such a change makes the split
between content and presentation harder to maintain though, possible but
harder, or shall we say more sophisticated.

--
Nigel J. Andrews
Logictree Systems Limited

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Maarten.Boekhold 2002-03-22 14:34:41 Have you seen these hackers: Uncle George
Previous Message Darren Ferguson 2002-03-22 14:02:36 Re: How to perform an identical insert?