Re: Random performance hit, unknown cause.

From: Brian Fehrle <brianf(at)consistentstate(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Random performance hit, unknown cause.
Date: 2012-04-12 21:31:21
Message-ID: 4F8749A9.9080309@consistentstate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Interesting, that is very likely.

In this system I have a table that is extremely active. On a 'normal'
day, the autovacuum process takes about 7 hours to complete on this
table, and once it's complete, the system performs an autoanalyze on the
table, finding that we have millions of new dead rows. Once this
happens, it kicks off the autovacuum again, so we basically always have
a vacuum running on this table at any given time.

If I were to tweak the autovacuum_vacuum_cost_delay parameter, what
would that be doing? Would it be limiting what the current autovacuum is
allowed to do? Or does it simply space out the time between autovacuum
runs? In my case, with 7 hour long autovacuums (sometimes 14 hours), a
few milliseconds between each vacuum wouldn't mean anything to me.

If that parameter does limit the amount of work autovacuum can do, It
may cause the system to perform better at that time, but would prolong
the length of the autovacuum right? That's an issue I'm already having
issue with, and wouldn't want to make the autovacuum any longer if I
don't need to.

- Brian F

On 04/12/2012 01:52 PM, Kevin Grittner wrote:
> Claudio Freire<klaussfreire(at)gmail(dot)com> wrote:
>> On Thu, Apr 12, 2012 at 3:41 PM, Brian Fehrle
>> <brianf(at)consistentstate(dot)com> wrote:
>>> This morning, during our nightly backup process (where we grab a
>>> copy of the data directory), we started having this same issue.
>>> The main thing that I see in all of these is a high disk wait on
>>> the system. When we are performing 'well', the %wa from top is
>>> usually around 30%, and our load is around 12 - 15. This morning
>>> we saw a load 21 - 23, and an %wa jumping between 60% and 75%.
>>>
>>> The top process pretty much at all times is the WAL Sender
>>> Process, is this normal?
>> Sounds like vacuum to me.
>
> More particularly, it seems consistent with autovacuum finding a
> large number of tuples which had reached their freeze threshold.
> Rewriting the tuple in place with a frozen xmin is a WAL-logged
> operation.
>
> -Kevin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2012-04-12 22:00:13 Re: Random performance hit, unknown cause.
Previous Message Claudio Freire 2012-04-12 19:59:10 Re: Random performance hit, unknown cause.