From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Abhijit Menon-Sen <ams(at)oryx(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org, arnt(at)oryx(dot)com |
Subject: | Re: 10x rowcount mis-estimation favouring merge over nestloop |
Date: | 2006-11-10 06:15:24 |
Message-ID: | 16547.1163139324@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Abhijit Menon-Sen <ams(at)oryx(dot)com> writes:
> The header_fields table contains 13.5M rows, of which only ~250K match
> the where condition. I created an index like this:
> create index hffpv on header_fields(field)
> where field<=12 and (part!='' or value ilike '%,%')
> Note the 2M estimated rowcount in the bitmap index scan on header_fields
> vs. the actual number (264180).
I think this is basically a lack-of-column-correlation-stats problem.
The planner is estimating this on the basis of the overall selectivity
of the "field<=12" condition, but it seems that "field<=12" is true for
a much smaller fraction of the rows satisfying (part!='' or value ilike '%,%')
than for the general population of rows in the header_fields table.
There's been some speculation about obtaining stats on partial indexes
as a substitute for solving the general problem of correlation stats,
but I for one don't have a very clear understanding of how it'd work.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Abhijit Menon-Sen | 2006-11-10 07:07:00 | Re: 10x rowcount mis-estimation favouring merge over nestloop |
Previous Message | Abhijit Menon-Sen | 2006-11-10 05:12:45 | 10x rowcount mis-estimation favouring merge over nestloop |