From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Index-only scans |
Date: | 2009-07-14 08:23:46 |
Message-ID: | 4A5C4092.6060607@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Simon Riggs wrote:
> On Mon, 2009-07-13 at 10:16 +0300, Heikki Linnakangas wrote:
>
>> Implementing index-only scans requires a few changes:
>
> I would like to see a clear exposition of the use cases and an an
> analysis of the costs and benefits of doing this. It sounds cool, but I
> want to know it is cool before we spend time solving all of the juicy
> problems. Perhaps a glue-and-string patch would help.
There's a working prototype at in my git repository at git.postgresql.org.
> Extra buffer accesses for vismap, crash-safe vismap sound like
> performance issues, as well as planner time, not to mention all the
> tuits needed. Will it damage the general case?
It does add some work to the planner, but I don't think it's noticeable.
The visibility map accesses are only needed when we're doing an
index-only scan, not in the general case, so the impact of those come
down to how well we can estimate the cost of index-only scans, so that
an index-only scan is not chosen when not beneficial.
> The single SQL example mentioned already has at least two mechanisms for
> improving performance of that type of query. We probably don't need
> another, or at least we need a good analysis of why.
Well, another class of queries where index-only scans are beneficial is
when you fetch a range of rows from index, where the heap fetches result
in a lot of random I/O. Clustering helps with that, but you can only
cluster a table on one column. A classic example where that's a problem
is a many-to-many relationship:
CREATE TABLE a (aid integer, ...);
CREATE TABLE b (bid integer, ...);
CREATE TABLE manytomany (aid integer, bid integer);
CREATE INDEX a_b ON manytomany (aid, bid);
CREATE INDEX b_a ON manytomany (bid, aid);
If you need to query the many-to-many relationship in "both directions", ie:
SELECT bid FROm manytomany WHERE aid = ?
SELECT aid FROM manytomany WHERE bid = ?
You have to choose which index you cluster the table on, which will be
fast, and the other query will be slow.
> The benefit that occurs to me most is covered indexes, i.e. it opens up
> new and interesting indexing strategies. Covered indexes are also one
> kind of materialized view. It may be better to implement mat views and
> gain wider benefits too.
Materialized view sure would be nice, but doesn't address quite the same
use cases. Doesn't help with the many-to-many example above, for
example. We should have both.
> Or maybe index-only scans are mat views, via
> some cunning plan?
Heh, no :-).
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2009-07-14 08:38:50 | Re: Index-only scans |
Previous Message | Jaime Casanova | 2009-07-14 07:58:48 | Re: COPY WITH CSV FORCE QUOTE * |