Re: BUG #18399: Query plan optimization results in runtime error when hoisting cast from inside subquery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: scknoblich(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18399: Query plan optimization results in runtime error when hoisting cast from inside subquery
Date: 2024-03-19 15:47:26
Message-ID: 3877106.1710863246@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> My expectation was that the query should behave as if the CTE (or subquery
> if I had used that instead) has fully executed before moving on to the main
> query and this behavior breaks that assumption (which may or may not be
> correct). I attempted to search the docs for anything related to this,
> either CTE/subquery execution order guarantees or hoisting of casts, but I
> couldn't find anything about it. If this turns out not to a bug and is just
> expected behavior with existing documentation could you please link me to
> the relevant page for future reference?

You need to materialize the CTE to be certain that it won't be
intermingled with the calling query:

=# with raw_data(id, value) as (values (1, '1900'), (2, '2100'), (3, 'abc')),
integers as materialized (select value::integer
from raw_data
where id in (select id from raw_data where value ~ '^\d+$'))
select count(*)
from integers
where value < 2000;
count
-------
1
(1 row)

See
https://www.postgresql.org/docs/current/sql-select.html#SQL-WITH
(last few paras of that sub-section).

The real bottom line here is that the cast to integer is presumed
to be side-effect-free, which isn't true if you're working with
data where it could throw errors. My advice is to rethink your
data model: this sort of EAV approach with inconsistent data
types in the "same" column just does not play nice with SQL
optimizers, so it'll cause you ongoing heartburn. If you think
you're too far down the road for that, consider replacing the
cast with some non-error-throwing user-defined function,
about like

case when x ~ '^\d+$' then x::integer else null::integer end

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Masahiko Sawada 2024-03-19 16:42:15 Re: Potential data loss due to race condition during logical replication slot creation
Previous Message PG Bug reporting form 2024-03-19 15:31:37 BUG #18400: logging_collector does not collect messages from postmaster