Re: Invalid "trailing junk" error message when non-English letters are used

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Karina Litskevich <litskevichkarina(at)gmail(dot)com>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Invalid "trailing junk" error message when non-English letters are used
Date: 2024-09-04 22:52:21
Message-ID: 1709057.1725490341@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Karina Litskevich <litskevichkarina(at)gmail(dot)com> writes:
> I see the two solutions here: either move the rule for decinteger_junk
> below the rules for hexinteger_junk, octinteger_junk and bininteger_junk,
> or just use a single rule decinteger_junk for all these cases, since the
> error message is the same anyway. I implemented the latter in the second
> version of the patch, also renamed this common rule to integer_junk.

That seems reasonable, but IMO this code was unacceptably
undercommented before and what you've done has made it worse.
We really need a comment block associated with the flex macros,
perhaps along the lines of

/*
* An identifier immediately following a numeric literal is disallowed
* because in some cases it's ambiguous what is meant: for example,
* 0x1234 could be either a hexinteger or a decinteger "0" and an
* identifier "x1234". We can detect such problems by seeing if
* integer_junk matches a longer substring than any of the XXXinteger
* patterns. (One "junk" pattern is sufficient because this will match
* all the same strings we'd match with {hexinteger}{identifier} etc.)
* Note that the rule for integer_junk must appear after the ones for
* XXXinteger to make this work correctly.
*/

(Hmm, actually, is that last sentence true? My flex is a bit rusty.)

param_junk really needs a similar comment, or maybe we could put
all the XXX_junk macros together and use one comment for all.

> Additionally, I noticed that this patch is going to change error messages
> in some cases, though I don't think it's a big deal. Example:
> Without patch:
> postgres=# select 0xyz;
> ERROR: invalid hexadecimal integer at or near "0x"
> With patch:
> postgres=# select 0xyz;
> ERROR: trailing junk after numeric literal at or near "0xyz"

That's sort of annoying, but I don't really see a better way,
or at least not one that's worth the effort.

>> FWIW output of the whole string in the error message doesnt' look nice to
>> me, but other places of code do this anyway e.g:
>> select ('1'||repeat('p',1000000))::integer;
>> This may be worth fixing.

I think this is nonsense: we are already in the habit of repeating the
whole failing query string in the STATEMENT field. In any case it's
not something for this patch to worry about.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jehan-Guillaume de Rorthais 2024-09-04 22:57:28 Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Previous Message Daniel Gustafsson 2024-09-04 22:10:17 Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions