BUG #16749: EXPLAIN only shows filtering from the first query in a union chain when using distinct on.

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: mike22e(at)gmail(dot)com
Subject: BUG #16749: EXPLAIN only shows filtering from the first query in a union chain when using distinct on.
Date: 2020-11-27 11:25:32
Message-ID: 16749-7a879a6a752b0843@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16749
Logged by: Michael Richards
Email address: mike22e(at)gmail(dot)com
PostgreSQL version: 12.2
Operating system: macOS Big Sur (11.0.1)
Description:

### Bug description:

The query plan produced by EXPLAIN ANALYZE does not show all filters being
applied when querying with distinct on over a union'd set of queries. It
only shows a filter as if there was just a single query (the first one in
the union chain). Confirmed in both Postgres 12 and 13.

### Example to reproduce:

Take the following query:

select * from generate_series(0,20) n1
left join (
select distinct on (id) * from (
select n2 as id, n2+2 as a from generate_series(0,10) n2
union all
select n3+5 as id, n3+12 as a from generate_series(0,10) n3
) s1 order by id, a
) s2 on s2.id = n1 where 20 is distinct from s2.a;

Which produces the correct results (row with column a = 20 is removed), as
shown below:

n1 | id | a
----+----+----
0 | 0 | 2
1 | 1 | 3
2 | 2 | 4
3 | 3 | 5
4 | 4 | 6
5 | 5 | 7
6 | 6 | 8
7 | 7 | 9
8 | 8 | 10
9 | 9 | 11
10 | 10 | 12
11 | 11 | 18
12 | 12 | 19
14 | 14 | 21
15 | 15 | 22
16 | |
17 | |
18 | |
19 | |
20 | |
(20 rows)

However, the row that contained a = 20 was produced using n3+12 in the
second query in the union, but the EXPLAIN ANALYZE for this query only shows
the n2+2 as a filter; the n3+12 is nowhere to be seen as a filter in the
query plan.

Hash Left Join (cost=1.52..1.79 rows=20 width=12) (actual
time=0.098..0.105 rows=20 loops=1)
Hash Cond: (n1.n1 = n2.n2)
Filter: (20 IS DISTINCT FROM ((n2.n2 + 2)))
Rows Removed by Filter: 1
-> Function Scan on generate_series n1 (cost=0.00..0.21 rows=21
width=4) (actual time=0.005..0.007 rows=21 loops=1)
-> Hash (cost=1.24..1.24 rows=22 width=8) (actual time=0.068..0.068
rows=16 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Unique (cost=0.91..1.02 rows=22 width=8) (actual
time=0.058..0.064 rows=16 loops=1)
-> Sort (cost=0.91..0.96 rows=22 width=8) (actual
time=0.058..0.059 rows=22 loops=1)
Sort Key: n2.n2, ((n2.n2 + 2))
Sort Method: quicksort Memory: 26kB
-> Append (cost=0.00..0.42 rows=22 width=8) (actual
time=0.010..0.024 rows=22 loops=1)
-> Function Scan on generate_series n2
(cost=0.00..0.14 rows=11 width=8) (actual time=0.009..0.011 rows=11
loops=1)
-> Function Scan on generate_series n3
(cost=0.00..0.17 rows=11 width=8) (actual time=0.009..0.011 rows=11
loops=1)

The relevant part of the query plan is lines 3-4.

Filter: (20 IS DISTINCT FROM ((n2.n2 + 2)))
Rows Removed by Filter: 1

The above filter did not actually remove any rows at all. The filter that
removed that 1 row should be Filter: (20 IS DISTINCT FROM ((n3.n3 + 12))).

This bug also applies when there are any number of union'd queries, not just
two—the query plan will still only show the filter from the first query in
the union chain. The query results seem to be correct, but the displayed
query plan does not account for it.

Cheers,
Michael

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Gustavsson Mikael 2020-11-27 12:56:53 Problem with pg_notify / listen
Previous Message PG Bug reporting form 2020-11-27 08:19:02 BUG #16748: postgis30_13 installation broken