Re: Excessive WAL generation and related performance issue

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Excessive WAL generation and related performance issue
Date: 2014-04-15 00:40:07
Message-ID: 20140415004007.GY2556@tamriel.snowman.net
Lists: pgsql-hackers

* Joe Conway (mail(at)joeconway(dot)com) wrote:
> That's the thing. I'm sure there is tuning and other things to improve
> this particular case, but creating over 20 times as much WAL as real
> data seems like pathological behavior to me.

Setting things up such that you update a single value on each page of an
index between checkpoints, which then requires basically rewriting the
entire index into the WAL as full-page writes after every checkpoint, is
definitely pathological behavior, but sadly not behavior we are likely
to be able to fix.
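
To put rough numbers on that, here is a minimal sketch of the cost model,
not a measurement. It assumes the default 8192-byte page size, a made-up
64-byte WAL record per change, and that the first change to a page in a
checkpoint cycle logs a full image of that page on top of the record. The
function name and the workload sizes are hypothetical, and the model
ignores details such as hole compression in page images.

```python
BLOCK_SIZE = 8192  # PostgreSQL's default page size in bytes


def wal_bytes(touched_pages, record_bytes, block_size=BLOCK_SIZE):
    """Estimate WAL bytes written during one checkpoint cycle.

    touched_pages holds one page number per single-row change.
    Simplifying assumption: the first change to a page in the cycle
    logs a full page image plus the record; later changes to the same
    page log only the record.
    """
    seen = set()
    total = 0
    for page in touched_pages:
        if page not in seen:
            seen.add(page)
            total += block_size  # full page image on first touch
        total += record_bytes
    return total


# Worst case: 1000 changes, each landing on a different page.
scattered = wal_bytes(range(1000), record_bytes=64)

# Friendlier case: the same 1000 changes spread over only 10 pages.
clustered = wal_bytes([i % 10 for i in range(1000)], record_bytes=64)

print(scattered, clustered, round(scattered / clustered, 1))
```

Under these assumptions the scattered workload writes 8,256,000 bytes of
WAL for 64,000 bytes of actual change records, while the clustered one
writes 145,920 bytes.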

This sounds like a great example of the unlogged table -> logged table
use-case. It also makes me wonder if we could provide an optimization
similar to the existing one for CREATE TABLE + COPY under
wal_level = minimal: we wouldn't WAL-log anything for CREATE TABLE + COPY,
even when wal_level is above minimal, until COMMIT, at which point we'd
blast the whole thing out in one shot.

Another option that you might consider is ordering your input, if
possible, to improve the chances that the same page is changed multiple
times within a given checkpoint cycle, hopefully reducing the number of
distinct pages changed per cycle.
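
As a rough illustration of why ordering helps, the sketch below counts
how many full page images a toy index would need for shuffled versus
sorted input. Everything in it is an assumption made for illustration:
keys map to pages by simple integer division, a checkpoint happens after
a fixed number of changes, and the function name and sizes are invented.
A real B-tree splits pages and behaves less tidily.

```python
import random


def full_page_images(keys, keys_per_page, changes_per_checkpoint):
    """Count first-touches of a page within each checkpoint cycle.

    Simplified model: key k lives on page k // keys_per_page, and a
    checkpoint occurs after every changes_per_checkpoint changes.
    """
    count = 0
    seen = set()
    for i, key in enumerate(keys):
        if i % changes_per_checkpoint == 0:
            seen.clear()  # checkpoint: every page needs a fresh image
        page = key // keys_per_page
        if page not in seen:
            seen.add(page)
            count += 1
    return count


rng = random.Random(0)  # fixed seed so the run is repeatable
shuffled = list(range(100_000))
rng.shuffle(shuffled)

random_fpi = full_page_images(shuffled, 100, 10_000)
sorted_fpi = full_page_images(sorted(shuffled), 100, 10_000)

print("random order:", random_fpi)
print("sorted order:", sorted_fpi)
```

With sorted input each of the 1000 pages is imaged once in total. With
shuffled input nearly every page is touched again in every cycle, so the
count comes out many times higher.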

Thanks,

Stephen
