From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Magnus Hagander <mha(at)sollentuna(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bad estimate on LIKE matching |
Date: | 2006-01-18 15:37:59 |
Message-ID: | 10172.1137598679@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Tue, 2006-01-17 at 13:53 +0100, Magnus Hagander wrote:
>> Any way to teach the planner about this?
> In a recent thread on -perform, I opined that this case could best be
> solved by using dynamic random block sampling at plan time followed by a
> direct evaluation of the LIKE against the sample. This would yield a
> more precise selectivity and lead to the better plan. So it can be
> improved for the next release.
I find it exceedingly improbable that we'll ever install any such thing.
On-the-fly sampling of enough rows to get a useful estimate would
increase planning time by orders of magnitude --- and most of the time
the extra effort would be unhelpful. In the particular case exhibited
by Magnus, it is *really* unlikely that any such method would do better
than we are doing now. He was concerned because the planner failed to
tell the difference between selectivities of about 1e-4 and 1e-6.
On-the-fly sampling will do better only if it manages to find some of
those rows, which it is unlikely to do with a sample size less than
1e5 or so rows. With larger tables the problem gets rapidly worse.
regards, tom lane
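
A rough illustration of the sample-size argument above, not part of the original message. This is a minimal sketch that assumes rows are sampled uniformly and independently, so the chance of seeing at least one matching row in a sample of n rows is 1 - (1 - p)^n for true selectivity p. The function names and the set of sample sizes are hypothetical, chosen only for this example; nothing here reflects how the PostgreSQL planner actually estimates LIKE selectivity.

```python
import math


def prob_at_least_one_hit(selectivity: float, sample_rows: int) -> float:
    """Chance that a sample contains at least one matching row.

    Assumes independent uniform draws: 1 - (1 - p)^n.
    log1p/expm1 keep the result accurate for very small p.
    """
    return -math.expm1(sample_rows * math.log1p(-selectivity))


def expected_hits(selectivity: float, sample_rows: int) -> float:
    """Expected number of matching rows in the sample (n * p)."""
    return selectivity * sample_rows


if __name__ == "__main__":
    # Hypothetical sample sizes; the two selectivities are the ones
    # mentioned in the message (about 1e-4 and 1e-6).
    for n in (1_000, 10_000, 100_000, 1_000_000):
        for p in (1e-4, 1e-6):
            print(
                f"n={n:>9,} p={p:.0e} "
                f"expected_hits={expected_hits(p, n):.3g} "
                f"P(at least one hit)={prob_at_least_one_hit(p, n):.4f}"
            )
```

Under these assumptions, a sample of 1e5 rows is expected to contain about 10 matches at selectivity 1e-4 but only about 0.1 at 1e-6, so smaller samples will usually see zero matches in the rarer case and cannot tell the two apart reliably.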