From: | Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | "Gaëtan Allart" <gaetan(at)nexylan(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: General performance/load issue |
Date: | 2011-11-25 11:43:12 |
Message-ID: | CAF6yO=3j2g4zimiad07=hW8CSKpe7OHDrd80jT+rEjUcytXFSw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Le 25 novembre 2011 11:25, Tomas Vondra <tv(at)fuzzy(dot)cz> a écrit :
> On 24 Listopad 2011, 23:19, Cédric Villemain wrote:
>> Le 24 novembre 2011 20:38, Gaëtan Allart <gaetan(at)nexylan(dot)com> a écrit :
>>> Finally, it "crashed" againŠ :-(
>>>
>>> Here's the output of iotop while databased was inaccessible :
>>>
>>> 32361 be/4 postgres 0.00 B/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(34847) idle
>>> 32244 be/4 postgres 163.48 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(34660) SELECT
>>> 32045 be/4 postgres 7.78 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(33765) SELECT
>>> 32158 be/4 postgres 7.78 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(34112) SELECT
>>> 32242 be/4 postgres 7.78 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(34632) SELECT
>>> 32372 be/4 postgres 0.00 B/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(38858) idle in transaction
>>> 32231 be/4 postgres 15.57 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(38602) SELECT
>>> 28811 be/4 postgres 3.89 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(40594) SELECT
>>> 32190 be/4 postgres 3.89 K/s 0.00 B/s 0.00 % 99.99 % postgres:
>>> mydb
>>> mydb host(38497) SELECT
>
> This iotop output is rather incomplete - it does not show which of the
> processes actually did the I/O. The fact that a process does 99% of the
> I/O and yet performs no reads or writes is a bit strange.
>
> Use "iotop -o" and post the result.
>
>>> Does this help?
>>
>> yes.
>> It seem you have an issue with your checkpoint syncing time, it is
>> fixed in 9.1 and backported in 9.0 here :
>> http://projects.2ndquadrant.com/backports
>
> People generally don't want to apply backports on their own, especially
> when it's a production server and when it's unclear it actually fixes the
> issue they have. I'm not sure about that.
I agree that most people don't want to do that themselves, but if it
happens to be the solution they can proceed or ask someone to do it.
People want to see their production system back to a normal situation,
here the limited information are not enought to be sure, but the
checkpoint sync time are clear: sync time are not correct.
It is very probable that compacting the fsync will help, but it is not
sure it is required yet.
>
>> It is possible you have other problems that explains the issue you
>> have. An immediate solution before trying a patch is to reduce your
>> shared_buffer setting to something very low, like 1GB.
>
> Well, using low shared_buffers was used especially before 8.3, when the
> spread checkpoints were not available. It prevents the I/O overload when
> the database suddenly decides to write all of the dirty buffers. But he's
> on 9.0 (so he already has spread checkpoints).
It is a different animal here.
>
> Plus the number of buffers he's writing is negligible - usually about 700
> buffers (6MB), 3192 buffers (25MB) at most. That surely should not be a
> problem for the SSD he's using.
See the blog entry from Greg Smith:
http://blog.2ndquadrant.com/en/2011/06/backporting-and-checkpoint-tro.html
And the slides of his talk at pgconf2011:
http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/WriteStuff-PGCon2011.pdf
I was just pointing that there are known issues in this area, with
known solutions.
Getting more information on vacuum activity, bgwriter activity should help too.
Gaëtan, do you still have critical performance issue, or is it back to
normal/loaded (but not overloaded) situation ?
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2011-11-25 13:12:32 | Re: General performance/load issue |
Previous Message | Tomas Vondra | 2011-11-25 10:25:19 | Re: General performance/load issue |