Re: [PATCH] Add --syntax to postgres for SQL syntax checking

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Josef Šimánek <josef(dot)simanek(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Add --syntax to postgres for SQL syntax checking
Date: 2024-05-15 17:42:58
Message-ID: CA+TgmoaxJFVH=Xer5ewGC5WY-qr71sB8H_KP6mmx1Z=WGg_g7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 25, 2024 at 5:24 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> I think there's about 0% chance we'd add this to "postgres" binary.

Several people have taken this position, so I'm going to mark
https://commitfest.postgresql.org/48/4704/ as Rejected.

My own opinion is that syntax checking is a useful thing to expose,
but I don't believe that this is a useful way to expose it. I think
this comment that Tom made upthread is right on target:

# Another thing I don't like is that this exposes to the user what ought
# to be purely an implementation detail, namely the division of labor
# between gram.y (raw_parser()) and the rest of the parser. There are
# checks that a user would probably see as "syntax checks" that don't
# happen in gram.y, and conversely there are some things we reject there
# that seem more like semantic than syntax issues.

I think that what that means in practice is that, while this patch may
seem to give reasonable results in simple tests, as soon as you try to
do slightly more complicated things with it, it's going to give weird
results, either failing to flag things that you'd expect it to flag,
or flagging things you'd expect it not to flag. Fixing that would be
either impossible or a huge amount of work depending on your point of
view. If you take the point of view that we need to adjust things so
that the raw parser reports all the things that ought to be reported
by a tool like this and none of the things that it shouldn't, then
it's probably just a huge amount of work. If you take the point of
view that what goes into the raw parser and what goes into parse
analysis ought to be an implementation decision untethered to what a
tool like this ought to report, then fixing the problems would be
impossible, or at least, it would amount to throwing away this patch
and starting over. I think the latter point of view, which Tom has
already taken, would be the more prevalent view among hackers by far,
but even if the former view prevailed, who is going to do all that
work?

I strongly suspect that doing something useful in this area requires
about two orders of magnitude more code than are included in this
patch, and a completely different design. If it actually worked well
to do something this simple, somebody probably would have done it
already. In fact, there are already tools out there for validation,
like https://github.com/okbob/plpgsql_check for example. That tool
doesn't do exactly the same thing as this patch is trying to do, but
it does do other kinds of validation, and it's 19k lines of code, vs.
the 45 lines of code in this patch, which I think reinforces the point
that you need to do something much more complicated than this to
create real value.

Also, the patch as proposed, besides being 45 lines, also has zero
lines of comments, no test cases, no documentation, and doesn't follow
the PostgreSQL coding standards. I'm not saying that to be mean, nor
am I suggesting that Josef should go fix that stuff. It's perfectly
reasonable to propose a small patch without a lot of research to see
what people think -- but now we have the answer to that question:
people think it won't work. So Josef should now decide to either give
up, or try a new approach, or if he's really sure that all the
feedback that has been given so far is completely wrong, he can try to
demonstrate that the patch does all kinds of wonderful things with
very few disadvantages. But I think if he does that last, he's going
to find that it's not really possible, because I'm pretty sure that
Tom is right.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-05-15 17:45:30 Re: Adding the extension name to EData / log_line_prefix
Previous Message Andres Freund 2024-05-15 17:07:58 Re: Adding the extension name to EData / log_line_prefix