From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | jeff(dot)janes(at)gmail(dot)com, alvherre(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: multivariate statistics v14 |
Date: | 2016-03-16 13:45:46 |
Message-ID: | 743161cc-771d-6e2b-c43f-4f2bd1fe6d42@2ndquadrant.com |
Lists: | pgsql-hackers |
Hi,
On 03/16/2016 03:58 AM, Tatsuo Ishii wrote:
> I apologize if this has already been discussed. I am new to this patch.
>
>> Attached is v15 of the patch series, fixing this and also doing quite a
>> few additional improvements:
>>
>> * added some basic examples into the SGML documentation
>>
>> * addressing the objectaddress omissions, as pointed out by Alvaro
>>
>> * support for ALTER STATISTICS ... OWNER TO / RENAME / SET SCHEMA
>>
>> * significant refactoring of MCV and histogram code, particularly
>> serialization, deserialization and building
>>
>> * reworking the functional dependencies to support more complex
>> dependencies, with multiple columns as 'conditions'
>>
>> * the reduction using functional dependencies is also significantly
>> simplified (I decided to get rid of computing the transitive closure
>> for now - it got too complex after the multi-condition dependencies,
>> so I'll leave that for the future)
>
> Do you have any other missing parts in this work? I am asking
> because I wonder if you want to push this into 9.6 or rather 9.7.
I think the first few parts of the patch series, namely:
* shared infrastructure (0002)
* functional dependencies (0003)
* MCV lists (0004)
* histograms (0005)
might make it into 9.6. I believe the code for building and storing the
different kinds of stats is reasonably solid. What probably needs more
thorough review are the changes in clauselist_selectivity(), but the
code in these parts is reasonably simple as it only supports using a
single multivariate statistic per relation.
The part (0006) that allows using multiple statistics (i.e. selects
which of the available stats to use and in what order) is probably the
most complex part of the whole patch, and I still have some questions
about some aspects of it myself. I don't think this part will get
into 9.6 at this point (although it'd be nice if we managed to do that).
I can also imagine moving the ndistinct pieces forward, in front of 0006,
if that helps get them into 9.6. There's a bit more work needed to make
them more flexible, though, to allow handling subsets of columns
(currently we need a perfect match).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services