From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCHES] GIN improvements |
Date: | 2008-12-22 01:56:06 |
Message-ID: | 1229910966.2285.76.camel@jdavis |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
On Fri, 2008-12-12 at 20:36 +0300, Teodor Sigaev wrote:
> Changes:
> - added vacuum_delay_point() in gininsertcleanup
> - add trigger to run vacuum by number of inserted tuples from
> last vacuum. Number of inserted tuples represents number
> of really inserted tuples in index and it is calculated as
> tuples_inserted + tuples_updated - tuples_hot_updated.
> Trigger fires on instuples > vac_base_thresh because search time is linear
> on number of pending pages (tuples)
Hi,
Comments:
1.
You use something like the following in a few places:
START_CRIT_SECTION();
...
l = PageAddItem(...);
if (l == InvalidOffsetNumber)
elog(ERROR, "failed to add item to index page in \"%s\"",
RelationGetRelationName(index));
It's no use using ERROR, because it will turn into PANIC, which is
obviously unacceptable. It looks to me like those conditions can't
happen anyway, so it's probably better to add a comment explaining why
it can't happen, and Assert().
2. It appears to be properly triggering autovacuum when only inserts
happen, so I think that issue is solved.
3. Simple performance result with autovacuum off:
create table random(i int[]);
insert into random select ARRAY[(random() * 10)::int, (random() *
10)::int, (random() * 10)::int, (random() * 10)::int, (random() *
10)::int, (random() * 10)::int, (random() * 10)::int, (random() *
10)::int, (random() * 10)::int, (random() * 10)::int] from
generate_series(1, 1000000);
\timing on
drop table foo;
create table foo(i int[]);
create index foogin on foo using gin (i);
insert into foo select i from random;
vacuum foo;
Without patch:
INSERT: 71s
VACUUM: 2s
total: 73s
With patch:
INSERT: 33s
VACUUM: 12s
total: 45s
So, there is a performance advantage. This was just a quick test to make
sure the numbers matched my expectations.
4. Heikki mentioned:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01832.php
"To make things worse, a query will fail if all the matching
fast-inserted tuples don't fit in the non-lossy tid bitmap."
That issue still remains, correct? Is there a resolution to that?
5. I attached a newer version merged with HEAD.
6. You defined:
#define GinPageHasFullRow(page) ( GinPageGetOpaque(page)->flags &
GIN_LIST_FULLROW )
But in many places you still do the same check without using that macro.
The macro has only one call site, so I suggest either removing the macro
entirely, or using it every place you check that flag.
7. I don't understand this chunk of code:
ItemPointerData item = pos->item;
if ( scanGetCandidate(scan, pos) == false || !
ItemPointerEquals(&pos->item, &item) )
elog(ERROR,"Could not process tuple"); /* XXX should not be here !
*/
How can (!ItemPointerEquals(&pos->item, &item)) ever happen?
And how can (scanGetCandidate(scan, pos) == false) ever happen? Should
that be an Assert() instead?
If those can happen during normal operation, then we need a better error
message there.
Regards,
Jeff Davis
Attachment | Content-Type | Size |
---|---|---|
gin-fast-insert-20081221.patch | text/x-patch | 85.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | ITAGAKI Takahiro | 2008-12-22 02:36:55 | Re: generic reloptions improvement |
Previous Message | Jaime Casanova | 2008-12-22 00:22:29 | Re: rules regression test failed on mingw |
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2008-12-22 20:18:57 | Re: [PATCHES] Infrastructure changes for recovery (v8) |
Previous Message | Simon Riggs | 2008-12-18 02:50:49 | Re: [PATCHES] Infrastructure changes for recovery (v8) |