Re: Thinking About Correlated Columns (again)

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: sthomas(at)optionshouse(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Thinking About Correlated Columns (again)
Date: 2013-05-15 20:22:33
Message-ID: 5193EE89.1090406@archidevsys.co.nz
Lists: pgsql-performance

On 16/05/13 03:52, Heikki Linnakangas wrote:
> On 15.05.2013 18:31, Shaun Thomas wrote:
>> I've seen conversations on this since at least 2005. There were even
>> proposed patches every once in a while, but never any consensus. Anyone
>> care to comment?
>
> Well, as you said, there has never been any consensus.
>
> There are basically two pieces to the puzzle:
>
> 1. What metric do you use to represent correlation between columns?
>
> 2. How do you collect that statistic?
>
> Based on the prior discussions, collecting the stats seems to be
> tricky. It's not clear for which combinations of columns it should be
> collected (all possible combinations? That explodes quickly...), or
> how it can be collected without scanning the whole table.
>
> I think it would be pretty straightforward to use such a statistic,
> once we have it. So perhaps we should get started by allowing the DBA
> to set a correlation metric manually, and use that in the planner.
>
> - Heikki
>
>
How about having pg compare the actual number of rows delivered with the
predicted number and, if the difference exceeds a specified threshold,
start maintaining statistics for the columns involved? There is obviously
more to it, such as whether the query is a relevant one to consider and
the size of the tables (there is no point in attempting to optimise
tables with only 10 rows, for example).
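
To make the idea a bit more concrete, here is a rough sketch in Python of
the decision I have in mind. It is only an illustration, not anything pg
does today: the names (PlanNodeSample, misestimate_ratio,
should_collect_stats) and the default thresholds (a factor of 10 and a
minimum of 1000 rows) are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PlanNodeSample:
    """One observation of a plan node (hypothetical structure)."""
    estimated_rows: float  # row count the planner predicted
    actual_rows: float     # row count actually delivered
    table_rows: int        # approximate size of the underlying table


def misestimate_ratio(estimated: float, actual: float) -> float:
    """Return how far off the estimate was, as a factor >= 1.

    Both values are clamped to at least 1 so that a node returning
    zero rows does not cause a division by zero.
    """
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return max(est / act, act / est)


def should_collect_stats(sample: PlanNodeSample,
                         ratio_threshold: float = 10.0,
                         min_table_rows: int = 1000) -> bool:
    """Decide whether extra statistics look worth maintaining.

    The thresholds are assumptions chosen for illustration only.
    """
    # Tiny tables are not worth optimising, however bad the estimate.
    if sample.table_rows < min_table_rows:
        return False
    ratio = misestimate_ratio(sample.estimated_rows, sample.actual_rows)
    return ratio >= ratio_threshold
```

Using a ratio rather than an absolute difference means that over- and
under-estimates are treated the same way, and the table size check covers
the "only 10 rows" case.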

Cheers,
Gavin
