From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Christian Schröder <cs(at)deriva(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] Performance of full outer join in 8.3 |
Date: | 2009-04-16 00:58:18 |
Message-ID: | 603c8f070904151758w6af25641xac831b4cb71c4184@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
On Wed, Apr 15, 2009 at 7:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> We could add some regression tests that create a sample data set,
>> ANALYZE it, and then EXPLAIN various things. The results should be
>> deterministic,
>
> Sorry, you're wrong.
>
> The output of EXPLAIN is nowhere near stable enough to use within the
> current exact-match regression test framework. I'm not sure it would
> be stable even if we suppressed the rowcount and cost figures. Those
> figures vary across platforms (because of alignment effects and probably
> other things) and are also sensitive to the timing of autovacuums. It
> is known that a nontrivial fraction of the existing regression test
> cases do suffer from uninteresting plan changes across platforms or
> as a result of various phase-of-the-moon effects; that's why we keep
> having to add "ORDER BY" clauses now and then.
Interesting. I suppose you could insulate yourself from this somewhat
by populating pg_statistic with a particular set of values rather than
relying on ANALYZE to gather them, but this would have the substantial
downside of being way more work to maintain, especially if anyone ever
changed pg_statistic.
On a more practical level, I do think we need to give real
consideration to some kind of options syntax for EXPLAIN, maybe
something as simple as:
EXPLAIN (option_name, ...) query
Or maybe:
EXPLAIN (option_name = value, ...) query
It may or may not be the case that generating a useful regression test
suite for the planner is too much work for anyone to bother, but they
certainly won't if the tools aren't available. It seems we get at
least one request a month for some kind of explain-output option:
suppress row counts, suppress costs, gather I/O statistics, show
outputs, show # of batches for a hash join, and on and on and on. I
think we should implement a very basic version that maybe does nothing
more than let you optionally suppress some of the existing output, but
which provides an extensible syntax for others to build on.
Would you support such a change?
> The other problem with any large set of such tests is that any time you
> intentionally change the optimizer, a great deal of careful analysis
> would be needed to determine if the resulting EXPLAIN changes were good,
> bad, or indifferent; not to mention whether the change *should* have
> changed some plans that did not change.
Arguably it would be a good thing to examine planner changes with this
level of scrutiny, but I agree that the prospect is pretty
intimidating.
> There might be net value in maintaining such a test suite, but it would
> be a lot of work with no certain benefit, and I don't see anyone
> stepping up to do it.
...Robert
From | Date | Subject | |
---|---|---|---|
Next Message | mrLami | 2009-04-16 00:59:57 | Re: PgAdmin & PosgreSQL Plus Setup |
Previous Message | Tom Lane | 2009-04-15 23:39:51 | Re: [GENERAL] Performance of full outer join in 8.3 |
From | Date | Subject | |
---|---|---|---|
Next Message | David Fetter | 2009-04-16 02:41:42 | Re: psql with "Function Type" in \df |
Previous Message | Alvaro Herrera | 2009-04-16 00:54:10 | Re: psql with "Function Type" in \df |