From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: New style of hash join proposal |
Date: | 2008-03-17 16:52:18 |
Message-ID: | 87tzj5b8e5.fsf@oxford.xeocode.com |
Lists: | pgsql-hackers |
"Gregory Stark" <stark(at)enterprisedb(dot)com> writes:
> It would be ideal if it could scan the invoices using an index, toss them all
> in a hash, then do a bitmap index scan to pull out all the matching detail
> records. If there are multiple batches it can start a whole new index scan for
> each of the batches.
A more general solution would be to find a way to postpone the heap lookups
until we really need columns which aren't present in the index keys.
So something like the aforementioned query
  select * from invoice join invoice_detail using (invoice_id) where invoice.quarter='Q4'
could be executed with a plan like
  Heap Scan on invoice_detail
    -> Heap Scan on invoice
      -> Nested Loop
        -> Index Scan on invoice_quarter
              Index Cond: (quarter='Q4')
        -> Index Scan on pk_invoice_detail
              Index Cond: (invoice_id = $0)
But that would be a much more wide-ranging change. And the heap access would
still not be sequential unless we do extra work to sort the index tuples by tid.
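To make the idea concrete, here is a rough Python sketch of the deferred-lookup plan: join on index entries alone, then sort the resulting tids before touching the heap. Everything in it is hypothetical toy data, not PostgreSQL code. In particular it assumes the invoice_quarter index also carries invoice_id (otherwise the join key would not be available without a heap visit), and the names Tid, index_only_join and heap_fetch_order are made up for illustration.

```python
from typing import NamedTuple


class Tid(NamedTuple):
    block: int   # heap page number
    offset: int  # line pointer within that page


# Hypothetical index contents. Assumption: invoice_quarter stores
# (quarter, invoice_id) plus the tid of the invoice heap tuple.
invoice_quarter_index = [
    ("Q3", 1, Tid(0, 1)),
    ("Q4", 2, Tid(5, 2)),
    ("Q4", 3, Tid(1, 4)),
]

# Hypothetical pk_invoice_detail: invoice_id -> tids of detail heap tuples.
pk_invoice_detail_index = {
    2: [Tid(9, 1), Tid(3, 2)],
    3: [Tid(3, 1)],
}


def index_only_join(quarter):
    """Nested loop over index entries only; no heap page is read here."""
    for q, invoice_id, invoice_tid in invoice_quarter_index:
        if q != quarter:
            continue
        for detail_tid in pk_invoice_detail_index.get(invoice_id, []):
            yield invoice_tid, detail_tid


def heap_fetch_order(pairs):
    """Sort joined pairs by detail tid so that heap is read in block order."""
    return sorted(pairs, key=lambda pair: pair[1])


pairs = list(index_only_join("Q4"))
ordered = heap_fetch_order(pairs)
detail_blocks = [detail_tid.block for _, detail_tid in ordered]
```

Note that sorting by the invoice_detail tid leaves the invoice tids in arbitrary order, so in this sketch only one of the two heap passes becomes sequential; the other would need its own sort.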
There would also be plenty of fiddly bits around which paths it's safe to
execute prior to checking the visibility. Obviously the visibility would have
to be checked before things like Unique or Aggregate nodes, since those would
otherwise see tuples that aren't visible to the query.
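A tiny sketch of why the ordering matters for something like an Aggregate node, again with made-up data: the heap_visible map below stands in for the visibility check against the heap and is purely an assumption for illustration.

```python
# Hypothetical candidate tids produced from index entries alone.
candidate_tids = [(3, 1), (3, 2), (9, 1)]

# Hypothetical result of checking each tuple in the heap: (3, 2) is a dead
# tuple that the index still points at.
heap_visible = {(3, 1): True, (3, 2): False, (9, 1): True}


def count_before_visibility_check(tids):
    """Aggregate run directly on index entries: counts the dead tuple too."""
    return len(tids)


def count_after_visibility_check(tids, visible):
    """Aggregate run after the heap visit has filtered out invisible tuples."""
    return sum(1 for tid in tids if visible[tid])


wrong_count = count_before_visibility_check(candidate_tids)
right_count = count_after_visibility_check(candidate_tids, heap_visible)
```

Here the count taken before the visibility check comes out one too high, which is the kind of wrong answer the heap visit has to be scheduled early enough to prevent.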
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!