From: | Manfred Koizar <mkoi-pg(at)aon(dot)at> |
---|---|
To: | "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es> |
Cc: | <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] please help on query |
Date: | 2002-07-11 16:47:03 |
Message-ID: | gu9riu8qp28ifp40qsva7b9puif47ihi5b@4ax.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-sql |
[moving to pgsql-sql]
On Thu, 11 Jul 2002 17:22:14 +0200, "Luis Alberto Amigo Navarro"
<lamigo(at)atc(dot)unican(dot)es> wrote:
>I can't improve performance on this query:
>
>SELECT
> supplier.name,
> supplier.address
>FROM
> supplier,
> nation
>WHERE
> supplier.suppkey IN(
> SELECT
> partsupp.suppkey
> FROM
> partsupp
> WHERE
> partsupp.partkey IN(
> SELECT
> part.partkey
> FROM
> part
> WHERE
> part.name like 'forest%'
> )
> AND partsupp.availqty>(
> SELECT
> 0.5*(sum(lineitem.quantity)::FLOAT)
> FROM
> lineitem
> WHERE
> lineitem.partkey=partsupp.partkey
> AND lineitem.suppkey=partsupp.partkey
^^^^^^^
suppkey ???
> AND lineitem.shipdate>=('1994-01-01')::DATE
> AND lineitem.shipdate<(('1994-01-01')::DATE+('1 year')::INTERVAL)::DATE
> )
> )
> AND supplier.nationkey=nation.nationkey
> AND nation.name='CANADA'
>ORDER BY
> supplier.name;
Luis,
rules of thumb: "Avoid subselects; use joins!" and "If you have to use
subselects, avoid IN, use EXISTS!"
Let's try. If partkey is unique in part, then
| FROM partsupp
| WHERE partsupp.partkey IN (SELECT part.partkey
can be replaced by
FROM partsupp ps, part p
WHERE ps.partkey = p.partkey
or
partsupp ps INNER JOIN part p
ON (ps.partkey = p.partkey AND p.name LIKE '...')
When we ignore "part" for now, your subselect boils down to
| SELECT partsupp.suppkey
| FROM partsupp
| WHERE partsupp.availqty > (
| SELECT 0.5*(sum(lineitem.quantity)::FLOAT)
| FROM lineitem
| WHERE lineitem.partkey=partsupp.partkey
| AND lineitem.suppkey=partsupp.suppkey
| AND lineitem.shipdate BETWEEN ... AND ...
| )
which can be rewritten to (untested)
SELECT ps.suppkey
FROM partsupp ps, lineitem li
WHERE li.partkey=ps.partkey
AND li.suppkey=ps.suppkey
AND lineitem.shipdate BETWEEN ... AND ...
GROUP BY ps.partkey, ps.suppkey
HAVING min(ps.availqty) > 0.5*(sum(lineitem.quantity)::FLOAT)
^^^
As all ps.availqty are equal in one group, you can as well
use max() or avg().
Now we have left only one IN:
| WHERE supplier.suppkey IN (
| SELECT partsupp.suppkey FROM partsupp WHERE <condition> )
Being to lazy to find out, if this can be rewritten to a join, let`s
apply rule 2 here:
WHERE EXISTS (
SELECT ... FROM partsupp ps
WHERE supplier.suppkey = ps.suppkey
AND <condition> )
HTH, but use with a grain of salt ...
>Sort (cost=2777810917708.17..2777810917708.17 rows=200 width=81)
^^^^^^^^^^^^^^^^
BTW, how many years are these? :-)
Servus
Manfred
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Lockhart | 2002-07-11 17:38:17 | Re: workaround for lack of REPLACE() function |
Previous Message | Bruce Momjian | 2002-07-11 16:46:06 | Re: Should this require CASCADE? |
From | Date | Subject | |
---|---|---|---|
Next Message | Luis Alberto Amigo Navarro | 2002-07-11 17:40:46 | Re: [HACKERS] please help on query |
Previous Message | Stephan Szabo | 2002-07-11 16:00:48 | Re: how to inherits the references... |