Re: Is Hash Agg being used? 7.4 seems to handle this query worse than 7.3

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-general(at)postgresql(dot)org
Subject: Re: Is Hash Agg being used? 7.4 seems to handle this query worse than 7.3
Date: 2003-02-11 17:02:59
Message-ID: 87u1fapx6k.fsf@stark.dyndns.tv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > Hm, CVS doesn't seem to be using a hash aggregate. At least, if it is it isn't
> > obvious from the plan.
>
> > SELECT hier.level_0_id as parent_id,
> > (select localized_text from localized_text where text_id = hier.short_name_text_id and lang_code = 'en') as name,
> > *
> > FROM hier LEFT OUTER JOIN (
> > SELECT distinct level_0_id, level_1_id
> > FROM cache_foo JOIN foo_hier USING (foo_id)
> > WHERE key_value = 839
> > AND dist < 60
> > ) AS cache ON (hier.hier_id = cache.level_1_id)
> > WHERE level = 1
> > ORDER BY 1,2
>
> Why would you expect hash aggregation to be used here? There's no
> aggregates ... nor even any GROUP BY.

Well, "SELECT distinct level_0_id, level_1_id" is equivalent to a GROUP BY
level_0_id, level_1_id.

Um, I think I grabbed the wrong query from the logs though, sorry. Here's a
better example from the actual code, in fact I think it's what the above query
turned into after more work.

There's only a small decrease in speed from 7.3 to CVS now, but I was hoping
for a big speed increase from hash aggregates since most of the time is being
sunk into that sort. But it definitely isn't using them. I guess TNSTAAFL.

SELECT hier.level_0_id as parent_id,
(select localized_text from localized_text where text_id = hier.short_name_text_id and lang_code = 'en') as name,
*
FROM hier LEFT OUTER JOIN (
SELECT min(dist) AS mindist, count(distinct foo_id) AS num_foos, level_1_id
FROM cache_foos JOIN foo_hier USING (foo_id)
WHERE key_id = 839
AND dist < 60
GROUP BY level_0_id, level_1_id
) AS cache ON (hier.hier_id = cache.level_1_id)
WHERE level = 1
ORDER BY level_0_id;

CVS:

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=3936.87..3937.06 rows=76 width=1596) (actual time=1047.13..1047.16 rows=31 loops=1)
Sort Key: hier.level_0_id
-> Hash Left Join (cost=2989.02..3934.50 rows=76 width=1596) (actual time=1044.90..1046.87 rows=31 loops=1)
Hash Cond: ("outer".hier_id = "inner".level_1_id)
-> Seq Scan on hier (cost=0.00..63.86 rows=14 width=1576) (actual time=7.92..8.13 rows=31 loops=1)
Filter: ("level" = 1)
-> Hash (cost=2951.21..2951.21 rows=15122 width=24) (actual time=1033.78..1033.78 rows=0 loops=1)
-> Subquery Scan "cache" (cost=2686.58..2951.21 rows=15122 width=24) (actual time=917.66..1033.60 rows=31 loops=1)
-> GroupAggregate (cost=2686.58..2951.21 rows=15122 width=24) (actual time=917.64..1033.40 rows=31 loops=1)
-> Sort (cost=2686.58..2724.38 rows=15122 width=24) (actual time=913.00..931.40 rows=16440 loops=1)
Sort Key: foo_hier.level_0_id, foo_hier.level_1_id
-> Merge Join (cost=123.56..1636.78 rows=15122 width=24) (actual time=280.80..779.05 rows=16440 loops=1)
Merge Cond: ("outer".foo_id = "inner".foo_id)
-> Index Scan using foo_hier_foo on foo_hier (cost=0.00..1173.54 rows=45140 width=12) (actual time=0.04..225.13 rows=45140 loops=1)
-> Sort (cost=123.56..123.73 rows=67 width=12) (actual time=280.69..302.62 rows=16441 loops=1)
Sort Key: cache_foos.foo_id
-> Index Scan using idx_cache_foos on cache_foos (cost=0.00..121.53 rows=67 width=12) (actual time=0.05..128.19 rows=16486 loops=1)
Index Cond: ((key_id = 839) AND (dist < 60::double precision))
SubPlan
-> Index Scan using localized_text_pkey on localized_text (cost=0.00..4.01 rows=1 width=516) (actual time=0.03..0.03 rows=1 loops=31)
Index Cond: ((text_id = $0) AND (lang_code = 'en'::bpchar))
Total runtime: 1058.63 msec

7.3:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4103.84..4103.92 rows=31 width=97) (actual time=1033.79..1033.82 rows=31 loops=1)
Sort Key: hier.level_0_id
-> Merge Join (cost=4094.32..4103.08 rows=31 width=97) (actual time=1031.62..1033.54 rows=31 loops=1)
Merge Cond: ("outer".hier_id = "inner".level_1_id)
-> Sort (cost=64.63..64.71 rows=31 width=77) (actual time=7.92..7.96 rows=31 loops=1)
Sort Key: hier.hier_id
-> Seq Scan on hier (cost=0.00..63.86 rows=31 width=77) (actual time=7.25..7.77 rows=31 loops=1)
Filter: ("level" = 1)
-> Sort (cost=4029.69..4033.87 rows=1674 width=24) (actual time=1023.54..1023.58 rows=31 loops=1)
Sort Key: "cache".level_1_id
-> Subquery Scan "cache" (cost=3730.75..3940.04 rows=1674 width=24) (actual time=829.88..1023.35 rows=31 loops=1)
-> Aggregate (cost=3730.75..3940.04 rows=1674 width=24) (actual time=829.86..1023.14 rows=31 loops=1)
-> Group (cost=3730.75..3856.32 rows=16743 width=24) (actual time=822.88..940.44 rows=16440 loops=1)
-> Sort (cost=3730.75..3772.61 rows=16743 width=24) (actual time=822.86..841.47 rows=16440 loops=1)
Sort Key: foo_hier.level_0_id, foo_hier.level_1_id
-> Hash Join (cost=1410.87..2556.15 rows=16743 width=24) (actual time=347.17..662.63 rows=16440 loops=1)
Hash Cond: ("outer".foo_id = "inner".foo_id)
-> Index Scan using idx_cache_foos on cache_foos (cost=0.00..810.43 rows=16743 width=12) (actual time=0.07..152.56 rows=16486 loops=1)
Index Cond: ((key_id = 839) AND (dist < 60::double precision))
-> Hash (cost=746.40..746.40 rows=45140 width=12) (actual time=345.53..345.53 rows=0 loops=1)
-> Seq Scan on foo_hier (cost=0.00..746.40 rows=45140 width=12) (actual time=0.06..213.59 rows=45140 loops=1)
SubPlan
-> Index Scan using localized_text_pkey on localized_text (cost=0.00..4.03 rows=1 width=17) (actual time=0.03..0.03 rows=1 loops=31)
Index Cond: ((text_id = $0) AND (lang_code = 'en'::bpchar))
Total runtime: 1039.57 msec

--
greg

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2003-02-11 17:09:51 Re: Fw: Priority against catalog
Previous Message Phil Howard 2003-02-11 16:57:47 mailing list archives