Re: [PATCH] TODO “Allow LISTEN on patterns”

From: Alexander Cheshev <alex(dot)cheshev(at)gmail(dot)com>
To: Emanuel Calvo <3manuek(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] TODO “Allow LISTEN on patterns”
Date: 2024-07-28 15:17:23
Message-ID: CAN_hQmt62SPeAgw-3bGkrr0kHQAgiN_-uGvgqkrq8Fh3gZS2Hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Emanuel,

I did a review on the new patch version and I observed that the identifier
> passed to the LISTEN command is handled differently between outer and
> inner
> levels.
>

We have the following grammar:

notify_channel:
ColId
{ $$ = $1; }
| notify_channel '.' ColId
{ $$ = psprintf("%s.%s", $1, $3); }

And ColId is truncated in core scanner:

ident = downcase_truncate_identifier(yytext, yyleng, true);

So each level is truncated independently. For this reason we observe the
behaviour which you described above.

Another observation, probably not strictly related to this patch itself but
> the async-notify tests, is that there is no test for
> "payload too long". Probably there is a reason on why isn't in the specs?
>

I believe that simply because not all functionality is covered with tests.
But I have noticed a very interesting test "channel name too long":

SELECT
pg_notify('notify_async_channel_name_too_long______________________________','sample_message1');
ERROR: channel name too long

But the behaviour is inconsistent with NOTIFY command:

NOTIFY notify_async_channel_name_too_long______________________________
NOTICE: identifier
"notify_async_channel_name_too_long______________________________" will be
truncated to ...

I guess that the expected behavior would be that if the outer level is
> truncated, the rest of the
> channel name should be ignored, as there won't be possible to notify it
> anyway.
>
> In the case of the inner levels creating a channel name too long, it may
> probably sane to just
> check the length of the entire identifier, and truncate -- ensuring that
> channel name doesn't
> end with the level separator.
>
>
Well, I believe that we can forbid too long channel names at all. So it
provides consistent behaviour among different ways we can send
notifications, and I agree with you that "there won't be possible to notify
it anyway". I created a patch for that and attached it to the email. In the
patch I relocated truncation from core scanner to parser. And as the same
core scanner is also used in plsql I added three lines of code to its
scanner to basically truncate too long identifiers in there. Here is an
example of the new behaviour:

-- Should fail. Too long channel names
NOTIFY notify_async_channel_name_too_long_________._____________________;
ERROR: channel name too long
LISTEN notify_async_channel_name_too_long_________%._____________________;
ERROR: channel name too long
UNLISTEN notify_async_channel_name_too_long_________%._____________________;
ERROR: channel name too long

Regards,
Alexander Cheshev

On Sun, 21 Jul 2024 at 21:36, Emanuel Calvo <3manuek(at)gmail(dot)com> wrote:

>
> Hi Alexander,
>
> I did a review on the new patch version and I observed that the identifier
> passed to the LISTEN command is handled differently between outer and
> inner
> levels.
>
> When the outer level exceeds the 64 characters limitation, the outer level
> of the
> channel name is truncated, but leaves the inner levels in the channel name
> due
> that isn't parsed in the same way.
>
> Also, even if the outer level isn't truncated, it is allowed to add
> channels names
> that exceeds the allowed identifier size.
>
> It can be reproduced just by:
>
> # LISTEN a.a.a.a.a.lot.of.levels..; -- this doesn't fail at LISTEN,
> but fails in NOTIFY due to channel name too long
>
> In the following, the outer level is truncated, but it doesn't cut out the
> inner levels. This leaves
> listening channels that cannot receive any notifications in the queue:
>
> # LISTEN
> notify_async_channel_name_too_long____________________________________.a.a.
> ...
> NOTICE: identifier .... will be truncated
>
> # select substring(c.channel,0,66), length(c.channel) from
> pg_listening_channels() c(channel) where length(c.channel) > 64;
> substring |
> notify_async_channel_name_too_long_____________________________.a
> length | 1393
>
>
> I guess that the expected behavior would be that if the outer level is
> truncated, the rest of the
> channel name should be ignored, as there won't be possible to notify it
> anyway.
>
> In the case of the inner levels creating a channel name too long, it may
> probably sane to just
> check the length of the entire identifier, and truncate -- ensuring that
> channel name doesn't
> end with the level separator.
>
> Another observation, probably not strictly related to this patch itself
> but the async-notify tests, is that there is no test for
> "payload too long". Probably there is a reason on why isn't in the specs?
>
>
> Regards,
>
>
> El lun, 15 jul 2024 a las 12:59, Alexander Cheshev (<
> alex(dot)cheshev(at)gmail(dot)com>) escribió:
>
>> Hi Emanuel,
>>
>> Changed implementation of the function Exec_UnlistenCommit . v2 of the
>> path contained a bug in the function Exec_UnlistenCommit (added a test case
>> for that) and also it was not implemented in natural to C form using
>> pointers. Now it looks fine and works as expected.
>>
>> In the previous email I forgot to mention that the new implementation of
>> the function Exec_UnlistenCommit has the same space and time complexities
>> as the original implementation (which doesn't support wildcards).
>>
>> Regards,
>> Alexander Cheshev
>>
>>
>> On Sat, 13 Jul 2024 at 13:26, Alexander Cheshev <alex(dot)cheshev(at)gmail(dot)com>
>> wrote:
>>
>>> Hi Emanuel,
>>>
>>> I did a test over the "UNLISTEN >" behavior, and I'm not sure if this is
>>>> expected.
>>>> This command I assume should free all the listening channels, however,
>>>> it doesn't
>>>> seem to do so:
>>>
>>>
>>> TODO “Allow LISTEN on patterns” [1] is a bit vague about that feature.
>>> So I didn't implement it in the first version of the patch. Also I see that
>>> I made a mistake in the documentation and mentioned that it is actually
>>> supported. Sorry for the confusion.
>>>
>>> Besides obvious reasons I think that your finding is especially
>>> attractive for the following reason. We have an UNLISTEN * command. If we
>>> replace > with * in the patch (which I actually did in the new version of
>>> the patch) then we have a generalisation of the above command. For example,
>>> UNLISTEN a* cancels registration on all channels which start with a.
>>>
>>> I attached to the email the new version of the patch which supports the
>>> requested feature. Instead of > I use * for the reason which I mentioned
>>> above. Also I added test cases, changed documentation, etc.
>>>
>>> I appreciate your work, Emanuel! If you have any further findings I will
>>> be glad to adjust the patch accordingly.
>>>
>>> [1]
>>> https://www.postgresql.org/message-id/flat/52693FC5.7070507%40gmail.com
>>>
>>> Regards,
>>> Alexander Cheshev
>>>
>>> Regards,
>>> Alexander Cheshev
>>>
>>>
>>> On Tue, 9 Jul 2024 at 11:01, Emanuel Calvo <3manuek(at)gmail(dot)com> wrote:
>>>
>>>>
>>>> Hello there,
>>>>
>>>>
>>>> El vie, 15 mar 2024 a las 9:01, Alexander Cheshev (<
>>>> alex(dot)cheshev(at)gmail(dot)com>) escribió:
>>>>
>>>>> Hello Hackers,
>>>>>
>>>>> I have implemented TODO “Allow LISTEN on patterns” [1] and attached
>>>>> the patch to the email. The patch basically consists of the following
>>>>> two parts.
>>>>>
>>>>> 1. Support wildcards in LISTEN command
>>>>>
>>>>> Notification channels can be composed of multiple levels in the form
>>>>> ‘a.b.c’ where ‘a’, ‘b’ and ‘c’ are identifiers.
>>>>>
>>>>> Listen channels can be composed of multiple levels in the form ‘a.b.c’
>>>>> where ‘a’, ‘b’ and ‘c’ are identifiers which can contain the following
>>>>> wildcards:
>>>>> * ‘%’ matches everything until the end of a level. Can only appear
>>>>> at the end of a level. For example, the notification channels ‘a.b.c’,
>>>>> ‘a.bc.c’ match against the listen channel ‘a.b%.c’.
>>>>> * ‘>’ matches everything to the right. Can only appear at the end of
>>>>> the last level. For example, the notification channels ‘a.b’, ‘a.bc.d’
>>>>> match against the listen channel ‘a.b>’.
>>>>>
>>>>>
>>>> I did a test over the "UNLISTEN >" behavior, and I'm not sure if this
>>>> is expected.
>>>> This command I assume should free all the listening channels, however,
>>>> it doesn't
>>>> seem to do so:
>>>>
>>>> postgres=# LISTEN device1.alerts.%;
>>>> LISTEN
>>>> postgres=# ;
>>>> Asynchronous notification "device1.alerts.temp" with payload "80"
>>>> received from server process with PID 237.
>>>> postgres=# UNLISTEN >;
>>>> UNLISTEN
>>>> postgres=# ; -- Here I send a notification over the same channel
>>>> Asynchronous notification "device1.alerts.temp" with payload "80"
>>>> received from server process with PID 237.
>>>>
>>>> The same happens with "UNLISTEN %;", although I'm not sure if this
>>>> should have
>>>> the same behavior.
>>>>
>>>> It stops listening correctly if I do explicit UNLISTEN (exact channel
>>>> matching).
>>>>
>>>> I'll be glad to conduct more tests or checks on this.
>>>>
>>>> Cheers,
>>>>
>>>>
>>>> --
>>>> --
>>>> Emanuel Calvo
>>>> Database Engineering
>>>> https://tr3s.ma/aobut
>>>>
>>>>
>
> --
> --
> Emanuel Calvo
> https://tr3s.ma/ <https://tr3s.ma/about>
>
>

Attachment Content-Type Size
v4-0001-Support-wildcards-in-LISTEN-command.patch text/x-patch 56.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-07-28 15:50:33 Re: race condition in pg_class
Previous Message Junwang Zhao 2024-07-28 14:49:47 Re: Make COPY format extendable: Extract COPY TO format implementations