From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> |
Cc: | Dani Oderbolz <oderbolz(at)ecologic(dot)de>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: [SQL] "SELECT IN" Still Broken in 7.4b |
Date: | 2003-08-21 20:42:20 |
Message-ID: | 25062.1061498540@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-sql |
Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> Within the scope of the new hashed IN stuff I believe so in at least some
> cases. I have a few million row table of integers where searching for
> values IN (~10000 values) takes longer than creating the temp table,
> copying into it and doing the in subquery.
I did some profiling and soon realized that the main problem is the
executor's trick for not returning the same row multiple times in a
multiple-indexscan plan node. The point is that given
WHERE a = 1 OR b = 1
you could create a plan that first indexscans on a, then indexscans on
b --- but you mustn't return any tuples in the second scan that you
already returned in the first. IndexNext solves this by evaluating the
prior-scan index conditions to see if they are true. Which is okay if
there aren't too many of them. But when you have an N-element IN list
this means you are going to do O(N^2) index expression evaluations.
In the 10000-element IN-list test case, ExecQual gets called almost
50 million times :-(
I'm toying with the notion of ripping out that logic and instead
building an in-memory hashtable of already-returned TIDs. This could
theoretically run out of memory if the multiple indexscan returns enough
tuples, but I think in practice that wouldn't happen because the planner
wouldn't choose an indexscan when very large numbers of tuples are being
selected.
Comments?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Edmund Dengler | 2003-08-21 20:42:24 | Re: Buglist |
Previous Message | Ian Barwick | 2003-08-21 20:28:52 | Re: [GENERAL] Need concrete "Why Postgres not MySQL" bullet list |
From | Date | Subject | |
---|---|---|---|
Next Message | Stephan Szabo | 2003-08-21 21:07:08 | Re: [SQL] "SELECT IN" Still Broken in 7.4b |
Previous Message | Stephan Szabo | 2003-08-21 17:28:53 | Re: "SELECT IN" Still Broken in 7.4b |