Re: plpgsql variable assignment not supporting distinct anymore

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: easteregg(at)verfriemelt(dot)org
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: plpgsql variable assignment not supporting distinct anymore
Date: 2021-01-22 13:58:50
Message-ID: CAFj8pRDCz-Pz-oZ1bROJaRNzYA+7=8SN_BBYONaikyCCihZ0pA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

pá 22. 1. 2021 v 14:41 odesílatel <easteregg(at)verfriemelt(dot)org> napsal:

> the code provided is just a little poc to get the error ( which i have not
> included with my first mail sorry. )
>
> ERROR: syntax error at or near "DISTINCT"
> LINE 8: _test := DISTINCT a FROM ( VALUES ( (true), ( true ) ) )...
>
>
> the code in production looked like this:
>
>
> _resource_id :=
> DISTINCT ti_resource_id
> FROM tabk.resource_timeline
> WHERE ti_a2_id = _ab2_id
> AND ti_type = 'task'
> ;
>
> this is backed up by a trigger function, that will ensure to every
> instance with the same ti_a2_id exists only one ti_resource_id, hence the
> query can never fail due to more than one row beeing returned. but this
> syntax is not supported anymore, which will break BC. up until PG 13, the
> assignment statement was just an implizit SELECT <expression> Query.
> Since Tom Lane didn't mentioned this change in the other thread, i figured
> the devteam might not be aware of this chance.
>
> i can refactor this line into
>
> _resource_id :=
> ti_resource_id
> FROM tabk.resource_timeline
> WHERE ti_a2_id = _ab2_id
> AND ti_type = 'task'
> GROUP BY ti_resource_id
> ;
>
> but concerns about BC was already raised, although with UNION there might
> be far less people affected.
> with kind regards, richard
>

Probably the fix is not hard, but it is almost the same situation as the
UNION case. The result of your code is not deterministic

If there are more different ti_resource_id then some values can be randomly
ignored - when hash agg is used.

The safe fix should be

_resource_id := (SELECT ti_resource_id
FROM tabk.resource_timeline
WHERE ti_a2_id = _ab2_id
AND ti_type = 'task');

and you get an exception if some values are ignored. Or if you want to
ignore some values, then you can write

_resource_id := (SELECT MIN(ti_resource_id) -- or MAX
FROM tabk.resource_timeline
WHERE ti_a2_id = _ab2_id
AND ti_type = 'task');

Using DISTINCT is not a good solution.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-01-22 14:06:01 Re: mkid reference
Previous Message Heikki Linnakangas 2021-01-22 13:50:04 Re: Some more hackery around cryptohashes (some fixes + SHA1)