Re: Re: Query 'Bout A Bug.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: David Lloyd-Jones <david(dot)lloyd-jones(at)attcanada(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Re: Query 'Bout A Bug.
Date: 2000-07-13 03:46:03
Message-ID: 26940.963459963@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
>> The bug list includes the following:
>> a.. SELECT foo UNION SELECT foo is incorrectly simplified to SELECT foo
>> Wy is this simplification incorrect? I don't get it.

> Not sure. Maybe someone can comment.

UNION implies DISTINCT according to the spec. Thus correct output from
the first query will contain no duplicates. The "simplified" version
will produce duplicates if there are any in the table.

We get this case right:

regression=# explain select f1+1 from int4_tbl union select f1+2 from int4_tbl;

Unique (cost=2.27..2.29 rows=1 width=4)
-> Sort (cost=2.27..2.27 rows=10 width=4)
-> Append (cost=0.00..2.10 rows=10 width=4)
-> Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4)
-> Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4)

(EXPLAIN doesn't show the expressions being computed at each step, but
you can see a UNIQUE is getting done) but not this case:

regression=# explain select f1+1 from int4_tbl union select f1+1 from int4_tbl;

Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4)

Why, you ask? Because someone thought it'd be a clever idea to simplify
redundant UNION/INTERSECT/EXCEPT query trees by pretending they are
OR/AND/AND NOT boolean-expression trees and handing them to cnfify().
cnfify knows "x OR x" reduces to "x". Neat idea, too bad the semantics
aren't quite the same. But really it's a waste of planning cycles
anyway, since who'd be likely to write such a query in the first place?
I'm planning to rip out the whole foolishness when we redo querytrees
for 7.2.

It would also be incorrect to simplify SELECT foo UNION ALL SELECT foo,
btw, since this should produce all the tuples in foo twice. This one
we get right, although I don't recall offhand what prevents us from
blowing it --- cnfify() certainly wouldn't know the difference.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-07-13 04:20:10 Re: Some Improvement
Previous Message Philip Warner 2000-07-13 03:05:19 Re: Re: Query 'Bout A Bug.