Re: Help interpreting pg_stat_bgwriter output

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: sam mulube <sam(dot)mulube(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Help interpreting pg_stat_bgwriter output
Date: 2009-08-13 08:00:06
Message-ID: alpine.GSO.2.01.0908130343590.13251@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 12 Aug 2009, sam mulube wrote:

>  is my interpreting of buffers_clean = 0 correct?

Yes.

>  If so, why would the bgwriter not be writing out any buffers?

The purpose of the cleaner is to prepare buffers that we expect will be
needed for allocations in the near future. Let's do a little math on your
system to guess why that's not happening.

> checkpoints_timed = 333
> checkpoints_req = 0

You're never triggering checkpoints from activity. This suggests that
your system is having a regular checkpoint every 5 minutes, and therefore
the time your server has been up is about 1665 minutes.

> bgwriter_delay = 200ms

With the background writer running 5 times per second, the data you've
sampled involved it running 1665 * 60 * 5 = 499500 times. During none of
those runs did it actually write anything; why?

> buffers_alloc = 19163

During those runs, 19163 buffers were allocated. This means that during
the average background writer delay nap, 19163 / 499500 = 0.04 buffers
were allocated. That's very little demand for buffers that need to be
cleaned on average, and the evidence here suggests the system is finding
plenty of cleaned up and ready to go buffers from the background
checkpoint process. It doesn't need to do any work on top of what the
checkpoint buffer cleanup is doing.

> buffers_backend = 740

This number represents the behavior the background writer is trying to
prevent--backends having to clean their own buffers up. Your result here
suggests that on average, during any 5 minute period there are 740 / 333 =
2.2 buffers being written that we might have had the background writer
take care of instead. Again, that's so little activity that the averages
the background writer estimates with aren't even detecting anything worth
doing.

In short, your system isn't nearly active enough for the background writer
to find itself with useful work to do, and one of the design goals for it
was to keep it from spinning around doing nothing in that situation. If
your system load goes up, I expect you'll discover cleaning starts
happening too.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
>From pgsql-general-owner(at)postgresql(dot)org Thu Aug 13 05:43:52 2009
Received: from localhost (unknown [200.46.208.211])
by mail.postgresql.org (Postfix) with ESMTP id 4164163548D
for <pgsql-general-postgresql(dot)org(at)mail(dot)postgresql(dot)org>; Thu, 13 Aug 2009 05:43:52 -0300 (ADT)
Received: from mail.postgresql.org ([200.46.204.86])
by localhost (mx1.hub.org [200.46.208.211]) (amavisd-maia, port 10024)
with ESMTP id 79574-02-2
for <pgsql-general-postgresql(dot)org(at)mail(dot)postgresql(dot)org>;
Thu, 13 Aug 2009 08:43:37 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from web24601.mail.ird.yahoo.com (web24601.mail.ird.yahoo.com [212.82.104.158])
by mail.postgresql.org (Postfix) with SMTP id 61784634DBF
for <pgsql-general(at)postgresql(dot)org>; Thu, 13 Aug 2009 05:42:57 -0300 (ADT)
Received: (qmail 70396 invoked by uid 60001); 13 Aug 2009 08:42:55 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.it; s=s1024; t=1250152975; bh=ULAdFh5KXFNNecb2fakmWVO9w0UtjKl2bH/BiUOmOZ8=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=25ZpmosKyw+xlv3f2inUKQV+g7D3ZiSFumejsTImgwq475XSsUPoaUnBI8p7UgBjtBnvbUHLKUgaxQYVOANp5UafpDBdlrRZox0ScZz4hVdmLerch72XS9sCBEGVt4mgz3ONJY3jLe5lgiWLhc+xz2MIIGCevyvz1QnST6pHix0=
DomainKey-Signature:a=rsa-sha1; q=dns; c=nofws;
s=s1024; d=yahoo.it;
h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding;
b=iKDPTl2a+a6B8NXl7y98MlOWoygbReVTRKNduUtwDs8girMxhtqdiq14TdvZenJY7bFLDWB1O5AjZmtMpug84ovE6QvZJ72LagERY2EdwQ60zRhCqq0XPnmY6f+aTHqPFgtXyYmqSTQcBGUC7VU9j3ks1tEB14j5PoY+ugPClK8=;
Message-ID: <292963(dot)52054(dot)qm(at)web24601(dot)mail(dot)ird(dot)yahoo(dot)com>
X-YMail-OSG: EAIBiBQVM1lCdSTOETgPhtRL06YDsza3hETn7GCX.sKMND9CCZ_tA9Zwfc95fMJx83FGFEMR2ovh0P3CPUURp6aXpXAMK7t39miwENVgMZoEpyJOiJVhbkbMKQlVDzTIjaqkXnyEMXiIqT6vqpiyZyAsKLhDqd94908TfpXm27lTr61TKQws3OfaDMj6qmaGnouM6kQJwC5RPrv3VqsBH65EiU1kjzCAVcwX4JGa90_jra5HxCLOURN6FGnDvUAMV3XHycMOZyb0aUKSYN1cd4YHEv84LIZpU5cfNvtMbsu5hl3MZ4o-
Received: from [87.10.126.161] by web24601.mail.ird.yahoo.com via HTTP; Thu, 13 Aug 2009 08:42:55 GMT
X-Mailer: YahooMailClassic/6.1.2 YahooMailWebService/0.7.338.2
Date: Thu, 13 Aug 2009 08:42:55 +0000 (GMT)
From: Scara Maccai <m_lists(at)yahoo(dot)it>
Subject: Re: totally different plan when using partitions
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
In-Reply-To: <4A83C286(dot)6090205(at)archonet(dot)com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Virus-Scanned: Maia Mailguard 1.0.1
X-Spam-Status: No, hits=-2.036 tagged_above=-10 required=5 tests=AWL=0.407,
BAYES_00=-2.599, SUBJECT_FUZZY_TION=0.156
X-Spam-Level:
X-Archive-Number: 200908/425
X-Sequence-Number: 151626

Thank you for your reply.
This makes partitions unusable for me... hope someone explains why this hap=
pens... this still looks like a bug to me...
BTW the problem arises when adding the second "left outer join": when using=
only 1 partitioned table (that is, only 1 "left outer join") the 2 plans a=
re identical.

My answers follow.

> Well, the first version was expecting about 400 rows, the
> second 15000, so it's not surprising that they have
> different plans.

Well, they're using exactly the same tables; I don't understand why one exp=
ects 400 rows and the other 15000....

> I'm not sure whether the planner is smart enough to cope
> with the multiple tests on time vs the partitioning and
> realise it can use your index on the partition.

Sorry, didn't understand that...

> I'm assuming the partition is defined as being "between
> '2006-10-01 00:00:00' and '2006-10-06 00:00:00'" - strikes
> me as an odd interval, and you don't say anywhere.

Data is partitioned on a "4 tables per month" basis

> You _might_ have a better plan by moving the partitioned
> tests into subqueries:
> FROM cell_bsc_60_0610 as cell_bsc
> left outer join (
> =A0 SELECT ne_id FROM teststscell73 WHEREtime between
> '2006-10-01 00:00:00' and '2006-10-06 00:00:00') as data on
> data.ne_id=3Dcell_bsc.nome1

Tried that, no changes.

=0A=0A=0A

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2009-08-13 09:22:24 Re: totally different plan when using partitions
Previous Message Richard Huxton 2009-08-13 07:36:38 Re: totally different plan when using partitions