Re: Batch update of indexes on data loading

From:	ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Batch update of indexes on data loading
Date:	2008-02-22 00:57:48
Message-ID:	20080222094033.8B6B.52131E4D@oss.ntt.co.jp
Lists:	pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:

> > The basic concept is to spool incoming data and merge the spool with
> > the existing indexes into a new index at the end of data loading. It is
> > 5-10 times faster than per-row index insertion, which is how 8.3 works.
>
> Please see
> http://thread.gmane.org/gmane.comp.db.postgresql.general/102370/focus=102901

Yeah, BEFORE INSERT FOR EACH ROW triggers are one of the problems.
I think it is enough to disallow bulk loading if the table has any
BEFORE INSERT triggers. That is not a serious limitation, because
DBAs often disable triggers during bulk loading for performance anyway.
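
To make that rule concrete, here is a rough sketch of spool-and-merge with the
trigger check in front. It is only an illustration, not the patch: a sorted
Python list stands in for the index, and bulk_load and its parameters are
made-up names.

```python
import heapq


def bulk_load(existing_index, new_keys, before_insert_triggers=()):
    """Merge new_keys into existing_index and return a new sorted index.

    existing_index: already-sorted list of keys (stand-in for a B-tree).
    new_keys: incoming keys in arbitrary order.
    before_insert_triggers: names of BEFORE INSERT triggers on the table
        (hypothetical parameter, used only to model the restriction).
    """
    # Refuse the bulk path entirely if any BEFORE INSERT trigger exists.
    if before_insert_triggers:
        raise ValueError("bulk loading disallowed: BEFORE INSERT triggers present")

    # Spool: collect the new keys and sort them once, instead of
    # inserting into the index row by row.
    spool = sorted(new_keys)

    # Merge: one sequential pass over both sorted inputs builds the new index.
    return list(heapq.merge(existing_index, spool))
```

For example, bulk_load([1, 4, 9], [7, 2, 4]) returns [1, 2, 4, 4, 7, 9], while
passing a non-empty before_insert_triggers raises an error, so the caller would
have to fall back to per-row insertion.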

>> You could work around this if the indexscan code knew to go search in the
>> list of pending insertions, but that's pretty ugly and possibly slow too.

I heard that approach is used in the Falcon storage engine for MySQL, so it
does not seem to be so unrealistic.
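
A toy sketch of what "search the list of pending insertions" could look like
follows. This is only my illustration under simple assumptions (a sorted list
as the index, an unsorted list as the pending spool); it says nothing about how
Falcon actually does it, and SpooledIndex is a made-up name.

```python
import bisect
import heapq


class SpooledIndex:
    """Sorted main index plus an unsorted list of pending insertions."""

    def __init__(self, sorted_keys):
        self.main = list(sorted_keys)  # stand-in for the on-disk index
        self.pending = []              # keys spooled during the load

    def insert(self, key):
        # Cheap append; the main index is not touched until flush().
        self.pending.append(key)

    def contains(self, key):
        # Normal index lookup: binary search in the main index.
        i = bisect.bisect_left(self.main, key)
        if i < len(self.main) and self.main[i] == key:
            return True
        # Extra step: also look in the pending list. This linear scan is
        # the "possibly slow" part of the workaround.
        return key in self.pending

    def flush(self):
        # End of load: merge the sorted spool into a new main index.
        self.main = list(heapq.merge(self.main, sorted(self.pending)))
        self.pending = []
```

With this layout a lookup sees spooled keys before the merge happens, at the
cost of a second search on every scan.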

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

In response to

Re: Batch update of indexes on data loading at 2008-02-21 14:28:00 from Alvaro Herrera

Responses

Re: Batch update of indexes on data loading at 2008-02-22 02:26:24 from Josh Berkus
