Re: Query result differences between PostgreSQL 17 vs 16

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Ronald Cruz <cruz(at)rentec(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, "'Peter Ford'" <pford(at)rentec(dot)com>, "Aaron J(dot) Garcia" <agarcia(at)rentec(dot)com>
Subject: Re: Query result differences between PostgreSQL 17 vs 16
Date: 2025-02-21 23:50:58
Message-ID: 3736132.1740181858@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Fri, Feb 21, 2025 at 11:13:17AM -0500, Ronald Cruz wrote:
>> We have observed an issue after upgrading to PostgreSQL 17 that caused us to
>> roll back to 16. Some of our queries are returning what I believe to be
>> erroneous results.

> We have a known problem with composite types and NULL constraints in PG
> 17 that I think we are fixing in PG 18.

There's no composite type at hand here. I think the problem is an
erroneous deduction from a column NOT NULL constraint. I can
reproduce a faulty plan in the regression database with

explain (costs off)
select * from tenk1
left join int4_tbl i on (unique1 = f1)
left join customer on (i.f1 = cid)
left join int4_tbl j on cid is not null;

v16 produces

Hash Left Join
Hash Cond: (tenk1.unique1 = i.f1)
-> Seq Scan on tenk1
-> Hash
-> Nested Loop Left Join
Join Filter: (customer.cid IS NOT NULL)
-> Hash Right Join
Hash Cond: (customer.cid = i.f1)
-> Seq Scan on customer
-> Hash
-> Seq Scan on int4_tbl i
-> Materialize
-> Seq Scan on int4_tbl j

but HEAD produces

Hash Left Join
Hash Cond: (tenk1.unique1 = i.f1)
-> Seq Scan on tenk1
-> Hash
-> Nested Loop Left Join
-> Hash Right Join
Hash Cond: (customer.cid = i.f1)
-> Seq Scan on customer
-> Hash
-> Seq Scan on int4_tbl i
-> Materialize
-> Seq Scan on int4_tbl j

Note the lack of any IS NOT NULL test. I think the planner has
convinced itself that the not null constraint on customer.cid
makes that test redundant, despite the fact that what it is
testing is a post-outer-join value that most certainly could
be null.

"git bisect" fingers this commit:

b262ad440edecda0b1aba81d967ab560a83acb8a is the first bad commit
commit b262ad440edecda0b1aba81d967ab560a83acb8a
Author: David Rowley <drowley(at)postgresql(dot)org>
Date: Tue Jan 23 18:09:18 2024 +1300

Add better handling of redundant IS [NOT] NULL quals

I've not looked at the code, but I suspect that it is failing
to check varnullingrels before believing that it can trust
the applicability of table constraints.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Richard Guo 2025-02-22 01:01:13 Re: Query result differences between PostgreSQL 17 vs 16
Previous Message Bruce Momjian 2025-02-21 23:05:10 Re: Query result differences between PostgreSQL 17 vs 16