From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Sheng Y(dot) Cheng" <scheng(at)adconion(dot)com>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4. |
Date: | 2009-09-02 00:34:45 |
Message-ID: | 13466.1251851685@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
I blithely opined:
> I believe BTW that there are related issues in other places where we
> expand composites into RowExprs. But the other places have been doing
> that for awhile. I think that for 8.4 our goals should be limited to
> not changing the behavior compared to prior releases.
So while I was testing my fix for this, I found out that that's more
complicated than I thought. Consider these examples in the regression
database:
select t1.q2, count(t2.*)
from int8_tbl t1 left join int8_tbl t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl offset 0) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
select t1.q2, count(t2.*)
from int8_tbl t1 left join
(select q1, case when q2=1 then 1 else q2 end as q2 from int8_tbl) t2
on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
If you believe that "t2.*" should go to NULL in a join-extended row,
then the correct answer for all four of these is
q2 | count
-------------------+-------
-4567890123456789 | 0
123 | 2
456 | 0
4567890123456789 | 6
(4 rows)
However, the actual behavior of every release since 8.0 has been that
the second case gives
q2 | count
-------------------+-------
-4567890123456789 | 1
123 | 2
456 | 1
4567890123456789 | 6
(4 rows)
ie, t2.* fails to go to NULL because it's expanded as ROW(t2.q1,t2.q2).
The OFFSET 0 in the third case restores expected behavior by preventing
flattening of the subquery, and up till 8.4 the CASE expression in the
fourth case did too.
With the fix I was just about to apply, all four cases give the first
set of results. This clearly satisfies the principle of least
astonishment, at least more nearly than what we have; but it equally
clearly is *not* going to restore 8.4 to work just like 8.3.
I'm inclined to apply the patch to 8.4 anyway, because it seems like a
bug fix. I would consider patching further back except there's no
chance of making it work in older branches, at least not without
destabilizing them quite a bit (the PlaceHolderVar mechanism would have
to be back-ported).
It might be possible to fix the older branches by not flattening
subqueries that have whole-row references; but even that would take
nontrivial work, and it'd be sacrificing performance to fix a corner
case no one has previously complained about. So I'm leaning to patching
8.4 and leaving the older branches alone.
Thoughts?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2009-09-02 02:01:31 | Re: pg_ctl infinite loop and memory leak |
Previous Message | Jeff Davis | 2009-09-01 23:48:06 | Re: BUG #5028: CASE returns ELSE value always when type is "char" |