Re: BUG #16585: Wrong filtering on a COALESCE field after using GROUPING SETS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pavelsivash(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16585: Wrong filtering on a COALESCE field after using GROUPING SETS
Date: 2020-08-20 21:25:27
Message-ID: 1750212.1597958727@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> The problem here is that something is assuming that the country_id is
> still constant 1 despite its participation in grouping sets rendering it
> sometimes null.

Yeah. Your version of the query is initially simplified, by one level
of subquery pullup, into

select coalesce(country_id, city_id) AS location_id,
total
from (select 1 as country_id,
city_id,
sum(cnt) as total
from (values (2,5),(2,1),(3,2),(3,7)) as table1(city_id,cnt)
group by grouping sets (1,2)) base
where coalesce(country_id, city_id) = 1;

We can't pull up the remaining subquery because it has GROUP BY.
But what we will try to do instead is to push down the outer
WHERE clause into the subquery (cf. set_subquery_pathlist and
subroutines). That code sees no reason not to do so, so
it converts this into

select coalesce(country_id, city_id) AS location_id,
total
from (select 1 as country_id,
city_id,
sum(cnt) as total
from (values (2,5),(2,1),(3,2),(3,7)) as table1(city_id,cnt)
group by grouping sets (1,2)
having coalesce(1, city_id) = 1
) base;

and then const-folding proves the HAVING to be constant-true.

> Most likely, that constant column needs to either be treated as not
> constant, or something should be replacing it with a PHV - I'd have to
> dig into the code a bit to see what's actually going wrong.

PHVs don't save us here because those are only added when pulling up
a subquery, which is not what's happening.

As a stopgap measure, I think what we have to do is teach
check_output_expressions that subquery output columns are
unsafe to reference if they are not listed in all grouping
sets (do I have that condition right?).

The scheme I've been thinking about for clarifying the nullability
semantics of Vars might eventually provide a nicer answer for this,
but we haven't got it today.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrew Gierth 2020-08-20 21:51:32 Re: BUG #16585: Wrong filtering on a COALESCE field after using GROUPING SETS
Previous Message Tom Lane 2020-08-20 19:18:24 Re: BUG #16272: Index expression can refer to wrong attributes if index is created via CREATE TABLE LIKE