From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Віталій Тимчишин <tivv00(at)gmail(dot)com> |
Cc: | Anne Rosset <arosset(at)collab(dot)net>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Unexpected query plan results |
Date: | 2009-06-02 11:48:13 |
Message-ID: | D88C42E0-E231-406C-A189-6B4ABB3CCFF1@gmail.com |
Lists: | pgsql-performance |
On Jun 2, 2009, at 6:20 AM, Віталій Тимчишин
<tivv00(at)gmail(dot)com> wrote:
>
>
> 2009/6/2 Robert Haas <robertmhaas(at)gmail(dot)com>
> On Mon, Jun 1, 2009 at 4:53 PM, Anne Rosset <arosset(at)collab(dot)net>
> wrote:
> >> On Mon, Jun 1, 2009 at 2:14 PM, Anne Rosset <arosset(at)collab(dot)net>
> wrote:
> >>> SELECT SUM(1) FROM item WHERE is_deleted = 'f';
> >>>    sum
> >>> ---------
> >>>  1824592
> >>> (1 row)
> >>>
> >>> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641';
> >>>   sum
> >>> --------
> >>>  122412
> >>> (1 row)
> >>>
> >>> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 'f';
> >>>  sum
> >>> -----
> >>>   71
> >>> (1 row)
> >>>
> >>> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 't';
> >>>   sum
> >>> --------
> >>>  122341
> >>> (1 row)
> >
> > The item table has 2324829 rows
>
> So 1824592/2324829 = 78.5% of the rows have is_deleted = false, and
> 122412/2324829 = 5.27% of the rows have the relevant folder_id. Therefore the
> planner assumes that there will be 2324829 * 78.5% * 5.27% =~
> 96,000 rows that satisfy both criteria (the original explain had
> 97,000; there's some variability due to the fact that the analyze only
> samples a random subset of pages), but the real number is 71, leading
> it to make a very bad decision. This is a classic "hidden
> correlation" problem, where two columns are correlated but the planner
> doesn't notice, and you get a terrible plan.
>
> Unfortunately, I'm not aware of any real good solution to this
> problem. The two obvious approaches are multi-column statistics and
> planner hints; PostgreSQL supports neither.
>
> How about a partial index (create index idx on item(folder_id) where
> not is_deleted)? Won't it have the required statistics (even if it is
> not used in the plan)?
I tried that; doesn't seem to work.
...Robert
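
The estimate discussed in the quoted text can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the planner simply multiplies the two per-column selectivities as if the columns were independent; the variable names are illustrative and the row counts are the ones reported earlier in the thread.

```python
# Row counts taken from the query results quoted in the thread.
total_rows = 2324829
not_deleted_rows = 1824592   # is_deleted = 'f'
folder_rows = 122412         # folder_id = 'tracker3641'
actual_both = 71             # both conditions together, as measured

# Per-column selectivities (fraction of the table matching each condition).
sel_not_deleted = not_deleted_rows / total_rows
sel_folder = folder_rows / total_rows

# Assumption: the two conditions are treated as statistically independent,
# so the combined selectivity is the product of the individual ones.
estimated_both = total_rows * sel_not_deleted * sel_folder

# How far the independence estimate is from the measured row count.
overestimate_factor = estimated_both / actual_both

print(f"selectivity of is_deleted = 'f': {sel_not_deleted:.2%}")
print(f"selectivity of folder_id match:  {sel_folder:.2%}")
print(f"estimated rows (independent):    {estimated_both:,.0f}")
print(f"actual rows:                     {actual_both}")
print(f"overestimate factor:             {overestimate_factor:,.0f}x")
```

The estimate comes out near 96,000 rows against 71 actual rows, which is the gap the quoted explanation attributes to the hidden correlation between folder_id and is_deleted.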