Quick Links

Re: Re: significant performance hit whenever autovacuum runs after upgrading from 9.0 -> 9.1

From:	Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To:	Lonni J Friedman <netllama(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Re: significant performance hit whenever autovacuum runs after upgrading from 9.0 -> 9.1
Date:	2012-05-23 21:45:38
Message-ID:	4FBD5A82.8050904@archidevsys.co.nz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 24/05/12 08:18, Lonni J Friedman wrote:
> On Wed, May 23, 2012 at 12:36 PM, Gavin Flower
> <GavinFlower(at)archidevsys(dot)co(dot)nz> wrote:
>> On 24/05/12 05:09, Lonni J Friedman wrote:
>>
>> On Wed, May 23, 2012 at 9:37 AM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Lonni J Friedman<netllama(at)gmail(dot)com> writes:
>>
>> After banging my head on the wall for a long time, I happened to
>> notice that khugepaged was consuming 100% CPU every time autovacuum
>> was running. I did:
>> echo "madvise"> /sys/kernel/mm/transparent_hugepage/defrag
>> and immediately the entire problem went away.
>>
>> Fascinating.
>>
>> In hindsight, sure. Before that, it was 2 days of horror.
>>
>> So this looks like a nasty Fedora16 kernel bug to me, or maybe
>> postgresql& Fedora16's default kernel settings are just not
>> compatible?
>>
>> I agree, kernel bug. What kernel version are you using exactly?
>>
>> I'm using the stock 3.3.5-2.fc16.x86_64 kernel that is in Fedora updates.
>>
>> Is anyone else using Fedora16& PostgreSQL-9.1 ?
>>
>> I use an F16 box daily, but can't claim to have done major performance
>> testing with it. Can you put together a summary of your nondefault
>> Postgres settings? I wonder whether it only kicks in for a certain
>> size of shared memory for instance.
>>
>> Oh yea, I'm quite certain that this is somehow related to my setup,
>> and not a generic problem with all F16/pgsql systems. For starters,
>> this problem isn't happening on any of the 3 standby systems, which
>> are all otherwise identical to the master in every respect. Also when
>> we had done some testing (prior to the upgrades), we never ran into
>> any of these problems. However our test environment was on smaller
>> scale hardware, with a much smaller number of clients (and overall
>> load).
>>
>> Here are the non default settings in postgresql.conf :
>> wal_level = hot_standby
>> archive_mode = on
>> archive_timeout = 61
>> max_wal_senders = 10
>> wal_keep_segments = 5000
>> hot_standby = on
>> log_autovacuum_min_duration = 2500
>> autovacuum_max_workers = 4
>> maintenance_work_mem = 1GB
>> checkpoint_completion_target = 0.7
>> effective_cache_size = 88GB
>> work_mem = 576MB
>> wal_buffers = 16MB
>> checkpoint_segments = 64
>> shared_buffers = 8GB
>> max_connections = 350
>>
>> Let me know if you have any other questions. I'd be happy to provide
>> as much information as possible if it can aid in fixing this bug.
>>
>> I think they will need details of things like: RAM, number/type processors,
>> number& type
>> of disks, disk controllers& any other hardware specs that might be relevant
>> etc.- at very
>> least: total RAM& number of spindles
> 16 core Xeon X5550 2.67GHz
> 128GB RAM
> $PGDATA sits on a RAID5 array comprised of 3 SATA disks. Its Linux's
> md software RAID.
How does this compare to your other machines running the same, or
similar, databases?
However, you do say that the other machines are indentical - but are the
other
machines different in any aspect, that might prove siginificant?

>
>> Also anything else running on the box.
> nothing else. its dedicated exclusively to postgresql.
>
>> Plus transaction load pattern - over time and read/write ratios.
> I'm not sure how I'd obtain this data. however, the patterns didn't
> change since the upgrade. If someone can point me in the right
> direction, I can at least obtain this data as its generated currently.
>
>> type/nature of queries
> I need some clarification on specifically what you're asking for here.

The complexity, structure, and features of the queries. Do you have lots
of sub queries,
and ORDER BY's? Also the number of tables accessed in a query. This is
heading into the
territory where others will be better placed to advise you as to what
might be relevant!

>
>> size of heavily accessed tables and their indexes
> there are several rather large tables (90 million+ rows), but most
> others are under 1M rows. However, most tables are accessed& written
> to with equal frequency.
The number, type, size, and usage of indexes might be relevant.

I hope the above comments help, probably others will have more specific
requests.

Cheers,
Gavin

In response to

Re: Re: significant performance hit whenever autovacuum runs after upgrading from 9.0 -> 9.1 at 2012-05-23 20:18:18 from Lonni J Friedman

Responses

Re: Re: significant performance hit whenever autovacuum runs after upgrading from 9.0 -> 9.1 at 2012-05-23 22:33:00 from Tom Lane
Re: Re: significant performance hit whenever autovacuum runs after upgrading from 9.0 -> 9.1 at 2012-05-24 19:28:14 from Lonni J Friedman

Browse pgsql-general by date

	From	Date	Subject
Next Message	Devrim GÜNDÜZ	2012-05-23 21:49:15	Re: Extreme PostgreSQL?
Previous Message	Lists	2012-05-23 21:37:38	Extreme PostgreSQL?