From: | Jim Nasby <jim(at)nasby(dot)net> |
---|---|
To: | Joe Conway <mail(at)joeconway(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Excessive WAL generation and related performance issue |
Date: | 2014-04-14 23:04:37 |
Message-ID: | 534C6985.6060406@nasby.net |
Lists: | pgsql-hackers |
On 4/14/14, 5:51 PM, Joe Conway wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 04/14/2014 03:17 PM, Jim Nasby wrote:
>> On 4/14/14, 4:50 PM, Andres Freund wrote:
>>> On 2014-04-14 14:33:03 -0700, Joe Conway wrote:
>>>> I realize there are many things that can be done to improve my
>>>> specific scenario, e.g. drop indexes before loading, change
>>>> various configs, etc. My purpose for this post is to ask if it
>>>> is really expected to get over 20 times as much WAL as heap
>>>> data?
>>>
>>> I'd bet a large percentage of this will be full page images of
>>> the index. The values you index are essentially distributed over
>>> the whole index, so you'll modify the same index values
>>> repeatedly. But often enough it won't be in the same checkpoint
>>> and thus will create full page images.
>>
>> My thought exactly...
>>
>> ISTM that we should be able to push all the index inserts to the
>> end of the transaction. That should greatly reduce the amount of
>> full page writes. That would also open the door for doing all the
>> index inserts in parallel.
>
> That's the thing. I'm sure there is tuning and other things to improve
> this particular case, but creating over 20 times as much WAL as real
> data seems like pathological behavior to me.
Can you take a look at what's actually going into WAL when the wheels fall off? I think it should be pretty easy to test the theory that it's a ton of full page writes of index leaf pages...
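
For a rough feel of how that theory plays out, here is a toy model, not a measurement. It assumes each row does one insert into a uniformly random index leaf page, that the first touch of a page after a checkpoint logs a full 8kB page image, and that every later touch logs a small record. The record size, row size, and the `simulate` helper are made-up illustration values, and heap WAL is ignored entirely.

```python
import random

PAGE_SIZE = 8192   # default PostgreSQL block size
INDEX_REC = 64     # ASSUMED size of an ordinary index-insert WAL record
HEAP_ROW = 100     # ASSUMED heap bytes per loaded row


def simulate(n_rows, n_leaf_pages, rows_per_checkpoint, seed=0):
    """Toy model: count full page images for random index inserts.

    Returns (full_page_images, index_wal_bytes, heap_bytes).
    """
    rng = random.Random(seed)
    touched = set()        # leaf pages already modified since the last checkpoint
    full_page_images = 0
    wal_bytes = 0
    for i in range(n_rows):
        if i % rows_per_checkpoint == 0:
            touched.clear()            # a checkpoint: next touch needs a fresh image
        page = rng.randrange(n_leaf_pages)
        if page not in touched:
            touched.add(page)
            full_page_images += 1
            wal_bytes += PAGE_SIZE     # first touch after checkpoint: whole page
        else:
            wal_bytes += INDEX_REC     # later touches: small record only
    return full_page_images, wal_bytes, n_rows * HEAP_ROW


if __name__ == "__main__":
    fpi, wal, heap = simulate(n_rows=1_000_000, n_leaf_pages=200_000,
                              rows_per_checkpoint=100_000)
    print(f"full page images: {fpi}, index WAL/heap ratio: {wal / heap:.1f}")
```

In this model, when the index has far more leaf pages than rows inserted between checkpoints, nearly every insert pays for a full page image, which is how index WAL can end up many times larger than the heap data.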
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net