From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: queries with DISTINCT / GROUP BY giving different plans |
Date: | 2013-08-16 19:36:19 |
Message-ID: | 520E7F33.7010305@fuzzy.cz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 14.8.2013 20:35, Tom Lane wrote:
> "Tomas Vondra" <tv(at)fuzzy(dot)cz> writes:
>> I've run into a strange plan difference on 9.1.9 - the first query
>> does "DISTINCT" by doing a GROUP BY on the columns (both INT). ...
>> Now, this takes ~45 seconds to execute, but after rewriting the
>> query to use the regular DISTINCT it suddenly switches to
>> HashAggregate with ~1/3 the cost (although it produces the same
>> output, AFAIK), and it executes in ~15 seconds.
>
> [ scratches head... ] I guess you're running into some corner case
> where choose_hashed_grouping and choose_hashed_distinct make
> different choices. It's going to be tough to debug without a test
> case though. I couldn't reproduce the behavior in a few tries here.
>
>> BTW I can't test this on 9.2 or 9.3 easily, as this is our
>> production environment and I can't just export the data. I've tried
>> to simulate this but so far no luck.
>
> I suppose they won't yet you step through those two functions with a
> debugger either ...
OK, this time the complete message ...
I've managed to get the data to a different machine, and I've spent some
time on debugging it. It seems that the difference is in evaluating
hashentrysize - while choose_hashed_distinct does this:
/*
* Don't do it if it doesn't look like the hashtable will fit into
* work_mem.
*/
hashentrysize = MAXALIGN(path_width)
+ MAXALIGN(sizeof(MinimalTupleData));
if (hashentrysize * dNumDistinctRows > work_mem * 1024L)
return false;
while choose_hashed_grouping does this:
/* Estimate per-hash-entry space at tuple width... */
hashentrysize = MAXALIGN(path_width)
+ MAXALIGN(sizeof(MinimalTupleData));
/* plus space for pass-by-ref transition values... */
hashentrysize += agg_costs->transitionSpace;
/* plus the per-hash-entry overhead */
hashentrysize += hash_agg_entry_size(agg_costs->numAggs);
if (hashentrysize * dNumGroups > work_mem * 1024L)
return false;
In both cases the common parameter values are
dNumGroups = dNumDistinctRows = 20451018
work_mem = 819200
but the hashentrysize size is 24 (choose_hashed_distinct) or 56
(choose_hashed_grouping). This causes that while _distinct evaluates the
condition as false, and _grouping as true (and thus returns false).
Now, the difference between 24 and 56 is caused by hash_agg_entry_size.
It's called with numAggs=0 but returns 32. I'm wondering if it should
return 0 in such cases, i.e. something like this:
Size
hash_agg_entry_size(int numAggs)
{
Size entrysize;
if (numAggs == 0)
return 0;
/* This must match build_hash_table */
entrysize = sizeof(AggHashEntryData) +
(numAggs - 1) * sizeof(AggStatePerGroupData);
entrysize = MAXALIGN(entrysize);
/* Account for hashtable overhead (assuming fill factor = 1) */
entrysize += 3 * sizeof(void *);
return entrysize;
}
I've tested that after this both queries use HashAggregate (which is the
right choice), but I haven't done any extensive checking so maybe I'm
missing something.
Tomas
From | Date | Subject | |
---|---|---|---|
Next Message | Robert DiFalco | 2013-08-17 20:19:10 | Create one query out of two |
Previous Message | Tomas Vondra | 2013-08-16 19:20:17 | Re: queries with DISTINCT / GROUP BY giving different plans |