Re: GIN fast insert

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: GIN fast insert
Date:	2009-02-11 03:38:47
Message-ID:	126.1234323527@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I think this is related to the problems with gincostestimate() that
> Tom Lane was complaining about here:
> http://archives.postgresql.org/message-id/20441.1234209245@sss.pgh.pa.us

> I am not 100% sure I'm understanding this correctly, but I think the
> reason why gincostestimate() is so desperate to avoid index scans when
> the pending list is long is because it knows that scanFastInsert()
> will blow up if an index scan is actually attempted because of the
> aforementioned TIDBitmap problem. This seems unacceptably fragile.

Yipes. If that's really the reason then I agree, it's a nonstarter.

> I think this code needs to be somehow rewritten to make things degrade
> gracefully when the pending list is long - I'm not sure what the best
> way to do that is. Inventing a new data structure to store TIDs that
> is never lossy seems like it might work, but you'd have to think about
> what to do if it got too big.

What would be wrong with letting it degrade to lossy? I suppose the
reason it's trying to avoid that is to avoid having to recheck all the
rows on that page when it comes time to do the index insertion; but
surely having to do that is better than having arbitrary, unpredictable
failure conditions.
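
To make the lossy-degradation idea concrete, here is a minimal sketch of a TID set that stores exact (block, offset) pairs until a budget is exceeded and then collapses whole pages to "lossy" entries. It is not PostgreSQL's TIDBitmap code. Every name in it (ToyTidSet, toy_add, toy_member, MAX_EXACT_TIDS, and so on) is hypothetical, and the tiny fixed limits stand in for a real memory budget. A lossy page reports every offset as a possible member and asks the caller to recheck the rows on that page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAGES      8   /* hypothetical capacity of this toy structure */
#define MAX_EXACT_TIDS 4   /* hypothetical budget standing in for a memory limit */

typedef struct PageEntry {
    uint32_t block;    /* heap block number */
    uint64_t offsets;  /* bit i set => offset i is a member (exact mode only) */
    bool     lossy;    /* true => whole page is a candidate, recheck needed */
    bool     used;
} PageEntry;

typedef struct ToyTidSet {
    PageEntry pages[MAX_PAGES];
    int       nexact;  /* number of exact TIDs currently stored */
} ToyTidSet;

static int count_bits(uint64_t v)
{
    int n = 0;
    while (v) { v &= v - 1; n++; }
    return n;
}

static void toy_init(ToyTidSet *s) { memset(s, 0, sizeof(*s)); }

/* Find the entry for a block, optionally creating it in a free slot. */
static PageEntry *toy_page(ToyTidSet *s, uint32_t block, bool create)
{
    PageEntry *free_slot = NULL;
    for (int i = 0; i < MAX_PAGES; i++) {
        if (s->pages[i].used && s->pages[i].block == block)
            return &s->pages[i];
        if (!s->pages[i].used && free_slot == NULL)
            free_slot = &s->pages[i];
    }
    if (!create || free_slot == NULL)
        return NULL;
    free_slot->used = true;
    free_slot->block = block;
    free_slot->offsets = 0;
    free_slot->lossy = false;
    return free_slot;
}

/* Convert the exact page holding the most TIDs into a lossy page. */
static void toy_lossify_one(ToyTidSet *s)
{
    PageEntry *victim = NULL;
    int best = 0;
    for (int i = 0; i < MAX_PAGES; i++) {
        PageEntry *p = &s->pages[i];
        int n = (p->used && !p->lossy) ? count_bits(p->offsets) : 0;
        if (n > best) { best = n; victim = p; }
    }
    if (victim != NULL) {
        victim->lossy = true;
        victim->offsets = 0;
        s->nexact -= best;
    }
}

/* Add a TID; returns false only if the toy page table itself is full. */
static bool toy_add(ToyTidSet *s, uint32_t block, unsigned off)
{
    assert(off < 64);
    PageEntry *p = toy_page(s, block, true);
    if (p == NULL)
        return false;
    if (p->lossy)
        return true;                 /* page already covers every offset */
    uint64_t bit = (uint64_t) 1 << off;
    if (!(p->offsets & bit)) {
        p->offsets |= bit;
        s->nexact++;
    }
    while (s->nexact > MAX_EXACT_TIDS)
        toy_lossify_one(s);          /* degrade instead of failing */
    return true;
}

/* Membership test; *recheck is set when the answer is only "maybe". */
static bool toy_member(ToyTidSet *s, uint32_t block, unsigned off, bool *recheck)
{
    PageEntry *p = toy_page(s, block, false);
    *recheck = false;
    if (p == NULL)
        return false;
    if (p->lossy) {
        *recheck = true;
        return true;
    }
    return off < 64 && (p->offsets & ((uint64_t) 1 << off)) != 0;
}
```

The trade-off in this sketch is the one described above: once a page goes lossy, every row on it has to be rechecked, but adding a TID never fails merely because the exact representation grew too large.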

It strikes me that part of the issue here is that the behavior of this
code is much better adapted to the bitmap-scan API than the traditional
indexscan API. Since GIN doesn't support ordered scan anyway, I wonder
whether it wouldn't be more sensible to simply allow it to not offer
the traditional API. It should be easy to make the planner ignore plain
indexscan plans for an AM that didn't support them.
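
As a rough illustration of that last point, the sketch below shows a planner-side filter that only proposes scan types an access method claims to support. It is a toy model, not PostgreSQL's planner or catalog layout: ToyAccessMethod, its has_gettuple and has_getbitmap flags, ToyPathKind, and toy_candidate_paths are all hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    PATH_PLAIN_INDEXSCAN,   /* tuple-at-a-time scan via the traditional API */
    PATH_BITMAP_SCAN        /* scan that fills a TID bitmap */
} ToyPathKind;

/* Hypothetical per-AM capability flags. */
typedef struct ToyAccessMethod {
    const char *name;
    bool has_gettuple;      /* offers the traditional indexscan API */
    bool has_getbitmap;     /* offers the bitmap-scan API */
} ToyAccessMethod;

/*
 * Write the path kinds worth costing for this AM into out[] (which must
 * have room for two entries) and return how many were written.
 */
static size_t toy_candidate_paths(const ToyAccessMethod *am, ToyPathKind *out)
{
    size_t n = 0;

    assert(am != NULL && out != NULL);
    if (am->has_gettuple)
        out[n++] = PATH_PLAIN_INDEXSCAN;
    if (am->has_getbitmap)
        out[n++] = PATH_BITMAP_SCAN;
    return n;
}
```

With an AM declared as {"gin", false, true}, only the bitmap path is returned, so a plain indexscan is never costed and the cost estimator has no need to discourage it artificially.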

regards, tom lane

In response to

GIN fast insert at 2009-02-11 02:59:54 from Robert Haas

Responses

Re: GIN fast insert at 2009-02-11 03:59:27 from Robert Haas
