From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: neqjoinsel versus "refresh materialized view concurrently"
Date: 2018-03-19 21:33:35
Message-ID: 334.1521495215@sss.pgh.pa.us
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> On Wed, Mar 14, 2018 at 2:56 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> Is there any good way to make the regression tests fail if the plan reverts
>> to the bad one? The only thing I can think of would be to make the table
>> bigger so the regression tests becomes "noticeably slower", but that is
>> pretty vague and not user friendly to formally pass and just hope it is slow
>> enough for someone to investigate.
> I can't think of a good way. I guess it can still pick a nested loop
> if it thinks there'll only be a couple of loops. This patch tells it
> to pay attention to the total cost, not the startup cost, so as soon
> as it thinks there is more than a handful of rows the quadratic cost
> will exceed the sort/merge's logarithmic cost.

Right. After further thought, the key point here is that in non-error
cases the query will produce no rows, meaning that it must be executed
to completion before we can be sure of that. But applying a LIMIT
encourages the planner to pick a fast-start (slow-completion) plan,
which is not going to be what we want. If in fact the query were going
to produce a lot of rows, and the planner could accurately predict that,
then maybe a LIMIT would be useful --- but there's no hope of estimates
on wholerowvar *= wholerowvar being accurate any time soon, let alone
correctly handling the correlation with ctid <> ctid. So the LIMIT
is just an invitation to trouble and we may as well remove it.

Committed that way. I also changed EXISTS(SELECT * ...) to
EXISTS(SELECT 1 ...), in hopes of saving a few microseconds of
parsing effort.
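
For context, the duplicate-row check under discussion (built in matview.c's
refresh_by_match_merge) has roughly this shape. The table name and aliases
below are illustrative, not the committed text; the real query is assembled
from the matview's actual temp-snapshot relation:

```sql
-- Sketch of the duplicate-detection query.  Success means zero rows,
-- so it must run to completion; a LIMIT would only steer the planner
-- toward a fast-start plan that is slow to finish.
SELECT newdata.*::text
FROM new_snapshot newdata
WHERE newdata.* IS NOT NULL AND EXISTS
  (SELECT 1 FROM new_snapshot newdata2
   WHERE newdata2.* IS NOT NULL
     AND newdata2.* OPERATOR(pg_catalog.*=) newdata.*
     AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid);
```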
> Since I've had hash joins on the mind recently I couldn't help
> noticing that you can't get a hash join out of this query's "record
> image" based join qual (or even a regular row-based =).
Yeah, because there's no hash support for recordimage. So there's even
less reason to be worried about how smart the planner is for this query:
basically, it might do a nestloop for a very small number of rows, but
otherwise it's gonna have to go for a merge join.
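
One way to see that from SQL: an operator is usable for hash join only if it
belongs to a hash operator family. This illustrative catalog query (standard
system catalogs, nothing project-specific assumed) lists the access methods
that have *= registered:

```sql
-- Which index access methods know about the record-image equality
-- operator *=?  On a stock install this should show only btree
-- entries, which is why the planner can merge-join on *= but has no
-- hash-join path for it.
SELECT am.amname, o.oprname, ao.amopstrategy
FROM pg_amop ao
JOIN pg_am am ON am.oid = ao.amopmethod
JOIN pg_operator o ON o.oid = ao.amopopr
WHERE o.oprname = '*=';
```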
My earlier thought that we might be able to skip the ANALYZE step
seems wrong, though. It's true that it does little for this query,
but the follow-on query to build a diff table can probably make
good use of the stats.

regards, tom lane