From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Joshua Tolley <eggyknap(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Cross-column statistics revisited |
Date: | 2008-10-16 17:20:30 |
Message-ID: | 17221.1224177630@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> I think you need to go a step back: how are you going to use this data?
The fundamental issue as the planner sees it is not having to assume
independence of WHERE clauses. For instance, given
    WHERE a < 5 AND b > 10
our current approach is to estimate the fraction of rows with a < 5
(using stats for a), likewise estimate the fraction with b > 10
(using stats for b), and then multiply these fractions together.
This is correct if a and b are independent, but can be very bad if
they aren't. So if we had joint statistics on a and b, we'd want to
somehow match that up to clauses for a and b and properly derive
the joint probability.
(I'm not certain of how to do that efficiently, even if we had the
right stats :-()
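
To make the failure mode concrete, here is a minimal sketch of the estimate described above, not the planner's actual code. The table contents (b fixed at 2 * a) and the `selectivity` helper are assumptions made up for illustration; real estimates come from per-column histograms rather than from scanning rows.

```python
from fractions import Fraction

def selectivity(rows, pred):
    """Fraction of rows satisfying pred (hypothetical helper, exact arithmetic)."""
    return Fraction(sum(1 for r in rows if pred(r)), len(rows))

# Hypothetical table of (a, b) pairs in which b is fully determined by a.
rows = [(a, 2 * a) for a in range(20)]

# Per-column estimates, as if computed from separate stats for a and for b.
sel_a = selectivity(rows, lambda r: r[0] < 5)
sel_b = selectivity(rows, lambda r: r[1] > 10)

# Independence assumption: multiply the two fractions together.
independent_estimate = sel_a * sel_b

# What joint statistics would ideally let us derive instead.
actual = selectivity(rows, lambda r: r[0] < 5 and r[1] > 10)

print("a < 5:", sel_a)
print("b > 10:", sel_b)
print("independent estimate:", independent_estimate)
print("actual joint fraction:", actual)
```

With this particular data no row satisfies both clauses, yet the product of the two per-column fractions predicts that a sizable share of the table will match.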
regards, tom lane