From: | PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Boszormenyi Zoltan <zb(at)cybertec(dot)at> |
Subject: | Re: cross column correlation revisted |
Date: | 2010-07-14 14:41:01 |
Message-ID: | A28B0AA5-34DA-4D5A-A78F-76E4F78FA310@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
hello ...
look at the syntax i posted in more detail:
>> ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = y.id2)
>
it says X and Y ...
the selectivity of joins are what i am most interested in. cross correlation of columns within the same table are just a byproduct.
the core thing is: how can i estimate the number of rows returned from a join?
an example would be: you have a email accounts + messages. you know that each row will match in a join as you can assume that every account will have a message.
what we need is a syntax which covers the join case and the case where columns inside the same table correlate.
and the fact that an index cannot cover two tables leads me to the conclusion that stats on an index are not the solution to the join problem.
many thanks,
hans
On Jul 14, 2010, at 4:22 PM, Tom Lane wrote:
> =?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= <postgres(at)cybertec(dot)at> writes:
>> i think that having stats on an index is a problem by itself for 2 reasons - for cross column correlation at least:
>
>> a.) joins cannot be covered by an index on two tables - we would fix "inside a table correlation problems" but not joins.
>
> Your proposed command didn't cover the two-table case either, and anyway
> what the heck do you mean by cross-correlation across tables?
> Cross-correlation is about the correlation between values in the same
> row.
>
>> b.) who says that there is actually an index in place?
>
> If the combination of columns is actually interesting, there might well
> be an index in place, or the DBA might be willing to create it. For
> that matter, have you considered the idea of examining the index
> contents to derive the statistics? Might work better than trying to get
> numbers via ANALYZE.
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-07-14 14:48:28 | standard_conforming_strings |
Previous Message | Robert Haas | 2010-07-14 14:39:50 | Re: five-key syscaches |