From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | tomas(dot)vondra(at)2ndquadrant(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org, jeff(dot)janes(at)gmail(dot)com, sfrost(at)snowman(dot)net |
Subject: | Re: multivariate statistics / patch v7 |
Date: | 2015-07-03 05:30:34 |
Message-ID: | 20150703.143034.132752108.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello, I started to work on this patch.
> attached is v7 of the multivariate stats patch. The main improvement
> is major refactoring of the clausesel.c portion - splitting the
> awfully long spaghetti-style functions into smaller pieces, making it
> much more understandable etc.
Thank you, it looks clearer. I have some comment for the brief
look at this. This patchset is relatively large so I will comment
on "per-notice" basis.. which means I'll send comment before
examining the entire of this patchset. Sorry in advance for the
desultory comments.
=======
General comments:
- You included unnecessary stuffs such like regression.diffs in
these patches.
- Now OID 3307 is used by pg_stat_file. I moved
pg_mv_stats_dependencies_info/show to 3311/3312.
- Single-variate stats have a mechanism to inject arbitrary
values as statistics, that is, get_relation_stats_hook and the
similar stuffs. I want the similar mechanism for multivariate
statistics, too.
0001:
- I also don't think it is right thing for expression_tree_walker
to recognize RestrictInfo since it is not a part of expression.
0003:
- In clauselist_selectivity, find_stats is uselessly called for
single clause. This should be called after the clauselist found
to consist more than one clause.
- Searching vars to be compared with mv-stat columns which
find_stats does should stop at disjunctions. But this patch
doesn't behave so and it should be an unwanted behavior. The
following steps shows that.
====
=# CREATE TABLE t1 (a int, b int, c int);
=# INSERT INTO t1 (SELECT a, a * 2, a * 3 FROM generate_series(0, 9999) a);
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=1 width=12)
=# ALTER TABLE t1 ADD STATISTICS (HISTOGRAM) ON (a, b, c);
=# ANALZYE t1;
=# EXPLAIN SELECT * FROM t1 WHERE a = 1 AND b = 2 OR c = 3;
Seq Scan on t1 (cost=0.00..230.00 rows=268 width=12)
====
Rows changed unwantedly.
It seems not so simple thing as your code assumes.
> I do assume some of those pieces are unnecessary because there already
> is a helper function with the same purpose (but I'm not aware of
> that). But IMHO this piece of code begins to look reasonable
> (especially when compared to the previous state).
Year, such kind of work should be done later:p This patch is
not-so-invasive so as to make it undoable.
> The other major improvement it review of the comments (including
> FIXMEs and TODOs), and removal of the obsolete / misplaced ones. And
> there was plenty of those ...
>
> These changes made this version ~20k smaller than v6.
>
> The patch also rebases to current master, which I assume shall be
> quite stable - so hopefully no more duplicate OIDs for a while.
>
> There are 6 files attached, but only 0002-0006 are actually part of
> the multivariate statistics patch itself. The first part makes it
> possible to use pull_varnos() with expression trees containing
> RestrictInfo nodes, but maybe this is not the right way to fix this
> (there's another thread where this was discussed).
As mentioned above, checking if mv stats can be applied would be
more complex matter than now you are assuming. I also will
consider that.
> Also, the regression tests testing plan choice with multivariate stats
> (e.g. that a bitmap index scan is chosen instead of index scan) fail
> from time to time. I suppose this happens because the invalidation
> after ANALYZE is not processed before executing the query, so the
> optimizer does not see the stats, or something like that.
I saw that occurs, but have no idea how it occurs so far..
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2015-07-03 05:34:44 | Re: WAL logging problem in 9.4.3? |
Previous Message | Amit Langote | 2015-07-03 05:23:58 | Re: [PROPOSAL] VACUUM Progress Checker. |