Quick Links

Re: [PATCHES] GIN improvements

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [PATCHES] GIN improvements
Date:	2008-12-22 01:56:06
Message-ID:	1229910966.2285.76.camel@jdavis
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-patches

On Fri, 2008-12-12 at 20:36 +0300, Teodor Sigaev wrote:
> Changes:
> - added vacuum_delay_point() in gininsertcleanup
> - add trigger to run vacuum by number of inserted tuples from
> last vacuum. Number of inserted tuples represents number
> of really inserted tuples in index and it is calculated as
> tuples_inserted + tuples_updated - tuples_hot_updated.
> Trigger fires on instuples > vac_base_thresh because search time is linear
> on number of pending pages (tuples)

Hi,

Comments:

You use something like the following in a few places:

START_CRIT_SECTION();
...
l = PageAddItem(...);
if (l == InvalidOffsetNumber)
elog(ERROR, "failed to add item to index page in \"%s\"",
RelationGetRelationName(index));

It's no use using ERROR, because it will turn into PANIC, which is
obviously unacceptable. It looks to me like those conditions can't
happen anyway, so it's probably better to add a comment explaining why
it can't happen, and Assert().

2. It appears to be properly triggering autovacuum when only inserts
happen, so I think that issue is solved.

3. Simple performance result with autovacuum off:

create table random(i int[]);
insert into random select ARRAY[(random() * 10)::int, (random() *
10)::int, (random() * 10)::int, (random() * 10)::int, (random() *
10)::int, (random() * 10)::int, (random() * 10)::int, (random() *
10)::int, (random() * 10)::int, (random() * 10)::int] from
generate_series(1, 1000000);

\timing on
drop table foo;
create table foo(i int[]);
create index foogin on foo using gin (i);
insert into foo select i from random;
vacuum foo;

Without patch:
INSERT: 71s
VACUUM: 2s
total: 73s

With patch:
INSERT: 33s
VACUUM: 12s
total: 45s

So, there is a performance advantage. This was just a quick test to make
sure the numbers matched my expectations.

4. Heikki mentioned:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01832.php

"To make things worse, a query will fail if all the matching
fast-inserted tuples don't fit in the non-lossy tid bitmap."

That issue still remains, correct? Is there a resolution to that?

5. I attached a newer version merged with HEAD.

6. You defined:

#define GinPageHasFullRow(page) ( GinPageGetOpaque(page)->flags &
GIN_LIST_FULLROW )

But in many places you still do the same check without using that macro.
The macro has only one call site, so I suggest either removing the macro
entirely, or using it every place you check that flag.

7. I don't understand this chunk of code:

ItemPointerData item = pos->item;

if ( scanGetCandidate(scan, pos) == false || !
ItemPointerEquals(&pos->item, &item) )
elog(ERROR,"Could not process tuple"); /* XXX should not be here !
*/

How can (!ItemPointerEquals(&pos->item, &item)) ever happen?

And how can (scanGetCandidate(scan, pos) == false) ever happen? Should
that be an Assert() instead?

If those can happen during normal operation, then we need a better error
message there.

Regards,
Jeff Davis

Attachment	Content-Type	Size
gin-fast-insert-20081221.patch	text/x-patch	85.1 KB

In response to

Re: [PATCHES] GIN improvements at 2008-12-12 17:36:55 from Teodor Sigaev

Responses

Re: [PATCHES] GIN improvements at 2009-01-16 12:39:36 from Teodor Sigaev

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	ITAGAKI Takahiro	2008-12-22 02:36:55	Re: generic reloptions improvement
Previous Message	Jaime Casanova	2008-12-22 00:22:29	Re: rules regression test failed on mingw

Browse pgsql-patches by date

	From	Date	Subject
Next Message	Heikki Linnakangas	2008-12-22 20:18:57	Re: [PATCHES] Infrastructure changes for recovery (v8)
Previous Message	Simon Riggs	2008-12-18 02:50:49	Re: [PATCHES] Infrastructure changes for recovery (v8)