Re: Why is this SELECT evaluated?

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: Miles Elam <miles(dot)elam(at)productops(dot)com>, PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why is this SELECT evaluated?
Date: 2023-01-28 20:34:01
Message-ID: 1572287198.422215.1674938041031@office.mailbox.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> On 28/01/2023 20:29 CET Miles Elam <miles(dot)elam(at)productops(dot)com> wrote:
>
> Postgres v15
>
> Given this example of a conversion from a byte array to an int8
> masquerading as an "unsigned" int4
>
> SELECT (get_byte(bytes, byte_offset)::int8 << 24)
> | (get_byte(bytes, byte_offset + 1) << 16)
> | (get_byte(bytes, byte_offset + 2) << 8)
> | (get_byte(bytes, byte_offset + 3))
> FROM ( VALUES ('\x010000'::bytea, 0) ) b(bytes, byte_offset)
> WHERE length(bytes) >= (4 + byte_offset)
> ;
>
> Why does this error result?
>
> ERROR: index 3 out of valid range, 0..2
> SQL state: 2202E
>
> I was under the impression that if the WHERE clause evaluated to
> false, the SELECT clause would not be evaluated.

Yes, according to https://www.postgresql.org/docs/15/sql-select.html#id-1.9.3.172.7
the WHERE clause is evaluated before the SELECT list.

> Why is get_byte(...) ever run in the first place even though length(bytes)
> is 3?

Postgres also applies constant folding which can be observed in the execution
plan (I removed the 4th get_byte call). The WHERE clause is always false:

QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
One-Time Filter: false
(2 rows)

And with those constants the SELECT list is evaluated before the statement is
processed in the documented order.

Does the SQL standard say anything about constant folding and when or if it can
be applied? I assume it's just an implementation detail of Postgres. Without
the constant folding I would also expect that query to just return the empty set.

get_byte checks the index at runtime. Adding a fourth byte (index is 0-based)
works as expected with index 3:

test=# select get_byte('\x010203', 3);
ERROR: index 3 out of valid range, 0..2

test=# select get_byte('\x01020304', 3);
get_byte
----------
4
(1 row)

test=# select get_byte('\x01020304', 4);
ERROR: index 4 out of valid range, 0..3

With a random bytea length the query is processed in the expected order and
returns one or zero rows but never raises an error:

SELECT
b,
(get_byte(bytes, byte_offset)::int8 << 24)
| (get_byte(bytes, byte_offset + 1) << 16)
| (get_byte(bytes, byte_offset + 2) << 8)
| (get_byte(bytes, byte_offset + 3))
FROM (
VALUES (substring('\x01020304'::bytea from 1 for (random() * 4)::int), 0)
) b(bytes, byte_offset)
WHERE
length(bytes) >= (4 + byte_offset);

--
Erik

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Erik Wienhold 2023-01-28 20:39:31 Re: Sequence vs UUID
Previous Message Miles Elam 2023-01-28 19:29:13 Why is this SELECT evaluated?