Re: still gin index creation takes forever

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>, pgsql-general(at)postgresql(dot)org
Subject: Re: still gin index creation takes forever
Date: 2008-11-13 14:06:15
Message-ID: 491C3457.8080104@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> Yeah, I'm not convinced either. Still, Teodor's theory should be easily
> testable: set synchronize_seqscans to FALSE and see if the problem goes
> away.

Test suit to reproduce the problem:
DROP TABLE IF EXISTS foo;
DROP TABLE IF EXISTS footmp;

CREATE OR REPLACE FUNCTION gen_array()
RETURNS _int4 AS
$$
SELECT ARRAY(
SELECT (random()*1000)::int
FROM generate_series(1,10+(random()*90)::int)
)
$$
LANGUAGE SQL VOLATILE;

SELECT gen_array() AS v INTO foo FROM generate_series(1,100000);

VACUUM ANALYZE foo;

CREATE INDEX fooidx ON foo USING gin (v);
DROP INDEX fooidx;

SELECT * INTO footmp FROM foo LIMIT 90000;

CREATE INDEX fooidx ON foo USING gin (v);
DROP INDEX fooidx;

On my notebook with HEAD and default postgresql.conf it produce (show only
interesting part):

postgres=# CREATE INDEX fooidx ON foo USING gin (v);
Time: 14961,409 ms
postgres=# SELECT * INTO footmp FROM foo LIMIT 90000;
postgres=# CREATE INDEX fooidx ON foo USING gin (v);
LOG: checkpoints are occurring too frequently (12 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (8 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (7 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (10 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (8 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
CREATE INDEX
Time: 56286,507 ms

So, time for creation is 4-time bigger after select.
Without "SELECT * INTO footmp FROM foo LIMIT 90000;":
postgres=# CREATE INDEX fooidx ON foo USING gin (v);
CREATE INDEX
Time: 13894,050 ms
postgres=# CREATE INDEX fooidx ON foo USING gin (v);
LOG: checkpoints are occurring too frequently (14 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
CREATE INDEX
Time: 15087,348 ms

Near to the same time.

With synchronize_seqscans = off and SELECT:
postgres=# CREATE INDEX fooidx ON foo USING gin (v);
CREATE INDEX
Time: 14452,024 ms
postgres=# SELECT * INTO footmp FROM foo LIMIT 90000;
postgres=# CREATE INDEX fooidx ON foo USING gin (v);
LOG: checkpoints are occurring too frequently (16 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
CREATE INDEX
Time: 14557,750 ms

Again, near to the same time.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Teodor Sigaev 2008-11-13 14:08:21 Re: still gin index creation takes forever
Previous Message Holger Hoffstaette 2008-11-13 14:03:22 Re: Upgrading side by side in Gentoo