Re: HEAD seems to generate larger WAL regarding GIN index

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: HEAD seems to generate larger WAL regarding GIN index
Date: 2014-03-20 20:12:03
Message-ID: 532B4B93.6060408@krogh.cc
Lists: pgsql-hackers

On 15/03/14 20:27, Heikki Linnakangas wrote:
> That said, I didn't expect the difference to be quite that big when
> you're appending to the end of the table. When the new entries go to
> the end of the posting lists, you only need to recompress and WAL-log
> the last posting list, which is max 256 bytes long. But I guess that's
> still a lot more WAL than in the old format.
>
> That could be optimized, but I figured we can live with it, thanks to
> the fastupdate feature. Fastupdate allows amortizing that cost over
> several insertions. But of course, you explicitly disabled that...

In a concurrent update environment, fastupdate as it is in 9.2 is not
really useful. It may let you batch up insertions, but you have no
control over who ends up paying the debt. Doubling the amount of WAL
from GIN indexing would be pretty tough for us. In 9.2 we generate
roughly 1 TB of WAL per day and keep it for some weeks to be able to do
PITR. The WAL is mainly due to GIN index updates as new data is added
and needs to be searchable by users. We do run gzip, which cuts it down
to 25-30% of its original size before we keep it for too long, but
doubling this is going to be a migration challenge.
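
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. The 1 TB/day volume, the 25-30% gzip ratio and the doubling come
from the figures above; the 21-day retention window and the function name
are assumptions made only for illustration ("some weeks" is not pinned
down), and the sketch treats all retained WAL as already compressed.

```python
def retained_wal_tb(daily_wal_tb, retention_days, compression_ratio,
                    growth_factor=1.0):
    """Estimate the on-disk size (in TB) of archived WAL kept for PITR.

    compression_ratio is the fraction of the original size left after
    gzip (0.25 means the archive shrinks to 25% of the raw WAL).
    growth_factor scales the raw WAL volume (2.0 models a doubling).
    """
    return daily_wal_tb * growth_factor * retention_days * compression_ratio


DAILY_WAL_TB = 1.0      # from the message: roughly 1 TB of WAL per day
RETENTION_DAYS = 21     # assumption: "some weeks" taken as three weeks

for ratio in (0.25, 0.30):
    current = retained_wal_tb(DAILY_WAL_TB, RETENTION_DAYS, ratio)
    doubled = retained_wal_tb(DAILY_WAL_TB, RETENTION_DAYS, ratio,
                              growth_factor=2.0)
    print(f"gzip to {ratio:.0%}: {current:.2f} TB now, "
          f"{doubled:.2f} TB if WAL volume doubles")
```

The estimate is linear in every input, so whatever the real retention
window is, doubling the raw WAL doubles the archive as well.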

If fastupdate could be made to work in an environment where users are
both searching the index and manually updating it, while 4+ backend
processes update the index concurrently, it would be a real benefit.

The GIN index currently contains 70+ million records with an average
tsvector of 124 terms.

--
Jesper .. trying to add some real-world info.

> - Heikki
>
>
