Re: Early WIP/PoC for inlining CTEs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Early WIP/PoC for inlining CTEs
Date: 2019-01-11 16:12:25
Message-ID: CA+Tgmob=Dpje+oGCYgJCoUok=iPQarGjPqRLYS1qKzChkFfKcA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I know this is a thorny topic, but I have to say that I am uneasy
> about the MATERIALIZED syntax. Here's how you write that in some
> other RDBMS that loves hints:
>
> WITH foo AS (SELECT /*+ MATERIALIZE */ ...)
>
> I understood that it was a long standing project policy that we don't
> want planner hints, but now we have a proposal to support one with a
> top-level non-standard syntax. If we take this syntax, should we not
> also accept MATERIALIZED in front of subselects?
>
> -1

I think this is unduly negative. Do you want to also remove the
IMMUTABLE, STABLE, and VOLATILE keywords from functions because those
are hints to the planner as to how those things should be treated?
Should we remove CREATE INDEX -- which is, after all, a non-standard
extension to SQL syntax -- because it presumes that the user is in a
better position than we are to know which columns ought to be indexed?

OK, I know that's a bit of a straw man -- you're talking about hints
within a query, not DDL. Still, I think our theory about not having
hints is that we should have the optimizer try to figure it out
instead of making the user specify the behavior that they want -- and
I think sometimes that's setting the bar at an impossible level. In
the case of things that have side effects, like FOR UPDATE/SHARE or
volatile functions, we really can't know the user's intention unless
the user tells us. But even in cases where there are no side effects
and it's just a question of finding the most efficient plan, it
doesn't seem crazy to me to allow the user to give us a clue about
what they have in mind.

It's not like this is a thing where we can get this right 90% of the
time and with some more planner work we can drive it up to near 100%.
We're going to be wrong a lot, even if we do expensive things like try
planning it both ways and see which one comes out cheaper on cost, and
we don't like driving up planner CPU consumption, either. So it seems
to me that letting the user say what they want is a very pragmatic
approach. Sometimes, like with things that have side effects, it's
the only way to know that we're even delivering the right answer; and
even when it's just an optimization problem, it's nice to give users a
way of fixing our planning failures that is better than asking them to
wait until we invent a way of improving the optimization decision in
some future release - which we may never even do.

I actually think that we should go "all in" here and allow the user to
specify either that they want materialization or that they don't want
materialization. If they specify neither, then we make some decision
which we may change in a future release. If they do specify
something, we do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-01-11 16:17:27 Re: Misleading panic message in backend/access/transam/xlog.c
Previous Message Robert Haas 2019-01-11 15:33:48 Re: Acceptable/Best formatting of callbacks (for pluggable storage)