From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Varadharajan Mukundan <srinathsmn(at)gmail(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Fwd: Slow Count-Distinct Query |
Date: | 2014-04-04 17:59:03 |
Message-ID: | CAMkU=1y9RqS7E7xsVkEqrhJ8ZtzDu1Cj6uwC4p7uWpZ5c5U-JQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Fri, Apr 4, 2014 at 2:31 AM, Varadharajan Mukundan
<srinathsmn(at)gmail(dot)com>wrote:
> Sorry that i just joined the list and have to break the thread to reply to
> Tom Lane's response on this @
> http://www.postgresql.org/message-id/13741.1396275339@sss.pgh.pa.us
>
>
> Note that the indexscan is actually *slower* than the seqscan so far as
>> the table access is concerned; if the table were big enough to not fit
>> in RAM, this would get very much worse. So I'm not impressed with trying
>> to force the optimizer's hand as you've done here --- it might be a nice
>> plan now, but it's brittle. See if a bigger work_mem improves matters
>> enough with the regular plan.
>
>
> I agree to the point that hand tuning optimiser is brittle and something
> that should not be done. But the reason to that was to force the index-only
> scan (Not the index scan). I feel Index-only scan would speed up given
> postgres is row oriented and we are running count-distinct on a column in a
> table with lot of columns (Say 6-7 in number). I think that is what have
> contributed to the gain in performance.
>
It looks like the original emailer wrote a query that the planner is not
smart enough to plan properly (A known limitation of that kind of query).
He then made a bunch of changes, none of which worked. He then re-wrote
the query into a form for which the planner does a better job on. What we
do not know is, what would happen if he undoes all of those other changes
and *just* uses the new form of the query?
>
> I did a similar test with around 2 million tuples with work_mem = 250 MB
> and got the query to respond with 2x speed up. But the speed-up got with
> index-only scan was huge and response was in sub-seconds whereas with
> work_mem the response was couple of seconds.
>
This change is almost certainly due to the change from a sort to a hash
aggregate, and nothing to do with the index-only scan at all.
From | Date | Subject | |
---|---|---|---|
Next Message | PARIS Nicolas | 2014-04-04 19:49:12 | Re: PGSQL 9.3 - Materialized View - multithreading |
Previous Message | Thom Brown | 2014-04-04 16:54:07 | Re: PGSQL 9.3 - Materialized View - multithreading |