From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: Index-only-scans, indexam API changes |
Date: | 2009-07-21 17:10:57 |
Message-ID: | 603c8f070907211010v266bcf2w744bfd533272a7a0@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jul 13, 2009 at 11:32 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Tom Lane wrote:
>> One thought here is that an AM call isn't really free, and doing two of
>> them instead of one mightn't be such a good idea. I would suggest
>> either having a separate AM entry point to get both bits of data
>> ("amgettupledata"?) or adding an optional parameter to amgettuple.
>
> I'm thinking of adding a new flag to IndexScanDesc, "needIndexTuple".
> When that is set, amgettuple stores a pointer to the IndexTuple in
> another new field in IndexScanDesc, xs_itup. (In case of b-tree at
> least, the IndexTuple is a palloc'd copy of the tuple on disk)
>
>> [ thinks a bit ... ] At least for GIST, it is possible that whether
>> data can be regurgitated will vary depending on the selected opclass.
>> Some opclasses use the STORAGE modifier and some don't. I am not sure
>> how hard we want to work to support flexibility there. Would it be
>> sufficient to hard-code the check as "pgam says the AM can do it,
>> and the opclass does not have a STORAGE property"? Or do we need
>> additional intelligence about GIST opclasses?
>
> Well, the way I have it implemented is that the am is not required to
> return the index tuple, even if requested. I implemented the B-tree
> changes similar to how we implement "kill_prior_tuple": in btgettuple,
> lock the index page and see if the tuple is still at the same position
> that we remembered in the scan opaque struct. If not (because of
> concurrent changes to the page), we give up and don't return the index
> tuple. The executor will then perform a heap fetch as before.
>
> Alternatively, we could copy all the matching index tuples to private
> memory when we step to a new index page, but that seems pretty expensive
> if we're only going to use the first few matching tuples (LIMIT), and I
> fear the memory management gets complicated. But in any case, the GiST
> issue would still be there.
>
> Since we're discussing it, I'm attaching the prototype patch I have for
> returning tuples from b-tree and using them to filter rows before heap
> fetches. I was going to post it in the morning along with a description
> of the planner and executor changes, but here goes. It applies on top
> of the indexam API patch I started this thread with.
I'm not sure where we are on this patch for reviewing purposes.
Heikki, are you planning to provide an updated patch? Or what should
we be doing here from an RRR standpoint?
...Robert
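
For readers following the thread, the fallback Heikki describes in the quoted text (remember the tuple's position, recheck it under the page lock, and give up if the page changed) can be sketched roughly as below. This is a minimal illustration only, not code from the patch: every type and function name here (DemoScan, DemoIndexPage, demo_fetch_index_tuple, and so on) is hypothetical, locking is omitted, and the heap TID is reduced to a plain integer. Only the names needIndexTuple and xs_itup come from the message itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
#define DEMO_PAGE_MAX_ITEMS 8

typedef struct DemoIndexTuple {
    int key;      /* indexed value */
    int heap_tid; /* simplified stand-in for the heap tuple pointer */
} DemoIndexTuple;

typedef struct DemoIndexPage {
    int nitems;
    DemoIndexTuple items[DEMO_PAGE_MAX_ITEMS];
} DemoIndexPage;

typedef struct DemoScan {
    bool need_index_tuple;         /* plays the role of "needIndexTuple" */
    int remembered_offset;         /* position remembered when the match was found */
    int remembered_tid;            /* heap TID seen at that position */
    DemoIndexTuple copy;           /* private copy handed back to the caller */
    const DemoIndexTuple *xs_itup; /* NULL means "no index tuple, do a heap fetch" */
} DemoScan;

/*
 * Try to return the index tuple for the current match. Returns false, and
 * leaves xs_itup NULL, when the tuple is no longer at the remembered
 * position, in which case the caller falls back to a heap fetch.
 */
bool demo_fetch_index_tuple(DemoScan *scan, const DemoIndexPage *page)
{
    assert(scan != NULL && page != NULL);

    scan->xs_itup = NULL;
    if (!scan->need_index_tuple)
        return false;

    /* A real access method would take a lock on the index page here. */
    int off = scan->remembered_offset;
    if (off < 0 || off >= page->nitems)
        return false; /* page shrank underneath us */
    if (page->items[off].heap_tid != scan->remembered_tid)
        return false; /* concurrent change moved the tuple: give up */

    scan->copy = page->items[off]; /* copy so the result outlives the lock */
    scan->xs_itup = &scan->copy;
    return true;
}
```

The point of the design is that returning the index tuple is best effort: the caller has to treat a NULL xs_itup as "fetch from the heap as before", which is why the access method is not required to return the tuple even when asked.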