From: | Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Leonardo Francalanci <m_lists(at)yahoo(dot)it> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Fast insertion indexes: why no developments |
Date: | 2013-10-30 18:38:00 |
Message-ID: | 52715208.4010608@archidevsys.co.nz |
Lists: | pgsql-hackers |
On 31/10/13 06:46, Jeff Janes wrote:
> On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci
> <m_lists(at)yahoo(dot)it> wrote:
>
> Jeff Janes wrote
> > The index insertions should be fast until the size of the active part of
> > the indexes being inserted into exceeds shared_buffers by some amount
> > (what that amount is would depend on how much dirty data the kernel is
> > willing to allow in the page cache before it starts suffering anxiety
> > about it). If you have enough shared_buffers to make that last for 15
> > minutes, then you shouldn't have a problem inserting with live indexes.
>
> Sooner or later you'll have to checkpoint those shared_buffers...
>
>
> True, but that is also true of indexes created in bulk. It all has to
> reach disk eventually: either the checkpointer writes it out and
> fsyncs it, or the background writer or user backends write it out and
> the checkpoint fsyncs it. If bulk creation uses a ring buffer
> strategy (I don't know if it does), then it might kick the buffers to
> the kernel in more or less physical order, which would help the kernel
> get them to disk in long sequential writes. Or not. I think that this is
> where sorted checkpoints could really help.
>
> > and we are talking about GB of data (my understanding is that we change
> > basically every btree page, resulting in re-writing of the whole index).
>
> If the checkpoint interval is as long as the partitioning period, then
> hopefully the active index buffers get re-dirtied while protected in
> shared_buffers, and only get written to disk once. If the buffers get
> read, dirtied, and evicted from a small shared_buffers over and over
> again, then you are almost guaranteed that they will get written to disk
> multiple times while they are still hot, unless your kernel is very
> aggressive about caching dirty data (which will cause other problems).
>
> Cheers,
>
> Jeff
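To make the point about repeated writes concrete, here is a rough toy model, not a description of how PostgreSQL actually manages buffers. It assumes a plain LRU cache standing in for shared_buffers (the real buffer manager uses a clock-sweep algorithm), uniformly random index page access, and one write for every dirty page that is evicted or flushed at the final checkpoint. The function name and parameters are made up for illustration.

```python
import random
from collections import OrderedDict


def count_disk_writes(num_pages, buffer_pages, num_inserts, seed=0):
    """Count page writes in a toy model of index insertion.

    Assumptions (all hypothetical simplifications):
    - each insert dirties one uniformly random index page
    - the buffer pool is a simple LRU cache of `buffer_pages` pages
    - every cached page is dirty, so each eviction costs one write
    - a single checkpoint at the end writes whatever is still cached
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # page id -> True, ordered from coldest to hottest
    writes = 0
    for _ in range(num_inserts):
        page = rng.randrange(num_pages)
        if page in cache:
            # Page is still in the buffer pool: re-dirtied with no extra I/O.
            cache.move_to_end(page)
        else:
            if len(cache) >= buffer_pages:
                # Evict the coldest dirty page, which must be written out.
                cache.popitem(last=False)
                writes += 1
            cache[page] = True
    # Final checkpoint: flush every page that is still dirty in the cache.
    writes += len(cache)
    return writes


if __name__ == "__main__":
    # Active index fits in the buffer pool: each page is written once.
    print("large buffers:", count_disk_writes(1000, 1000, 50000))
    # Buffer pool is a tenth of the active index: pages are rewritten often.
    print("small buffers:", count_disk_writes(1000, 100, 50000))
```

In this model the large-buffer run writes each touched page once at the checkpoint, while the small-buffer run writes the same pages many times over, which is the effect described above.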
How about being able to mark indexes:
'MEMORY ONLY' to make them not go to disk
and
'PERSISTENT | TRANSIENT' to mark whether they should be recreated on
machine bootup?
Or something similar.
Cheers,
Gavin