Re: [PATCH] Add --syntax to postgres for SQL syntax checking

From: Josef Šimánek <josef(dot)simanek(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Add --syntax to postgres for SQL syntax checking
Date: 2024-05-15 18:12:30
Message-ID: CAFp7Qwr2wcHZdy++69LLS71YyCjp_3Q6j-nPxRaeEXcnSN3d8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

st 15. 5. 2024 v 19:43 odesílatel Robert Haas <robertmhaas(at)gmail(dot)com> napsal:
>
> On Sun, Feb 25, 2024 at 5:24 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> > I think there's about 0% chance we'd add this to "postgres" binary.
>
> Several people have taken this position, so I'm going to mark
> https://commitfest.postgresql.org/48/4704/ as Rejected.
>
> My own opinion is that syntax checking is a useful thing to expose,
> but I don't believe that this is a useful way to expose it. I think
> this comment that Tom made upthread is right on target:
>
> # Another thing I don't like is that this exposes to the user what ought
> # to be purely an implementation detail, namely the division of labor
> # between gram.y (raw_parser()) and the rest of the parser. There are
> # checks that a user would probably see as "syntax checks" that don't
> # happen in gram.y, and conversely there are some things we reject there
> # that seem more like semantic than syntax issues.
>
> I think that what that means in practice is that, while this patch may
> seem to give reasonable results in simple tests, as soon as you try to
> do slightly more complicated things with it, it's going to give weird
> results, either failing to flag things that you'd expect it to flag,
> or flagging things you'd expect it not to flag. Fixing that would be
> either impossible or a huge amount of work depending on your point of
> view. If you take the point of view that we need to adjust things so
> that the raw parser reports all the things that ought to be reported
> by a tool like this and none of the things that it shouldn't, then
> it's probably just a huge amount of work. If you take the point of
> view that what goes into the raw parser and what goes into parse
> analysis ought to be an implementation decision untethered to what a
> tool like this ought to report, then fixing the problems would be
> impossible, or at least, it would amount to throwing away this patch
> and starting over. I think the latter point of view, which Tom has
> already taken, would be the more prevalent view among hackers by far,
> but even if the former view prevailed, who is going to do all that
> work?
>
> I strongly suspect that doing something useful in this area requires
> about two orders of magnitude more code than are included in this
> patch, and a completely different design. If it actually worked well
> to do something this simple, somebody probably would have done it
> already. In fact, there are already tools out there for validation,
> like https://github.com/okbob/plpgsql_check for example. That tool
> doesn't do exactly the same thing as this patch is trying to do, but
> it does do other kinds of validation, and it's 19k lines of code, vs.
> the 45 lines of code in this patch, which I think reinforces the point
> that you need to do something much more complicated than this to
> create real value.
>
> Also, the patch as proposed, besides being 45 lines, also has zero
> lines of comments, no test cases, no documentation, and doesn't follow
> the PostgreSQL coding standards. I'm not saying that to be mean, nor
> am I suggesting that Josef should go fix that stuff. It's perfectly
> reasonable to propose a small patch without a lot of research to see
> what people think -- but now we have the answer to that question:
> people think it won't work. So Josef should now decide to either give
> up, or try a new approach, or if he's really sure that all the
> feedback that has been given so far is completely wrong, he can try to
> demonstrate that the patch does all kinds of wonderful things with
> very few disadvantages. But I think if he does that last, he's going
> to find that it's not really possible, because I'm pretty sure that
> Tom is right.

I'm totally OK to mark this as rejected and indeed 45 lines were just
minimal patch to create PoC (I have coded this during last PGConf.eu
lunch break) and mainly to start discussion.

I'm not sure everyone in this thread understands the reason for this
patch, which is clearly my fault, since I have failed to explain. Main
idea is to make a tool to validate query can be parsed, that's all.
Similar to running "EXPLAIN query", but not caring about the result
and not caring about the DB structure (ignoring missing tables, ...),
just checking it was successfully executed. This definitely belongs to
the server side and not to the client side, it is just a tool to
validate that for this running PostgreSQL backend it is a "parseable"
query.

I'm not giving up on this, but I hear there are various problems to
explore. If I understand it well, just running the parser to query
doesn't guarantee the query is valid, since it can fail later for
various reasons (I mean other than missing table, ...). I wasn't aware
of that. Also exposing this inside postgres binary seems
controversial. I had an idea to expose parser result at SQL level with
a new command (similar to EXPLAIN), but you'll need running PostgreSQL
backend to be able to use this capability, which is against one of the
original ideas. On the otherside PostgreSQL exposes a lot of "meta"
functionality already and this could be a nice addition.

I'll revisit the discussion again and try to submit another try once
I'll get more context and experience.

Thanks everyone for constructive comments!

> --
> Robert Haas
> EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josef Šimánek 2024-05-15 18:13:11 Re: [PATCH] Add --syntax to postgres for SQL syntax checking
Previous Message Robert Haas 2024-05-15 18:00:12 Re: psql JSON output format