Re: BUG #8396: Window function results differ when selecting from table and view, with where clause

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: paul(at)weotta(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8396: Window function results differ when selecting from table and view, with where clause
Date: 2013-08-25 16:22:47
Message-ID: 20121.1377447767@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

paul(at)weotta(dot)com writes:
> When I select from a view, the where clause in my select statement does not
> restrict the rows processed by window functions referenced inside the view
> definition. Thus, if window functions are involved, using a where clause
> when selecting from a view and using a where clause when selecting directly
> from the underlying table produces different results.

I don't see anything even a little bit surprising about this. A WHERE
clause applied to a view should hide some rows that would appear in the
view output without it, but it should certainly never change the contents
of the rows it does let through. Now, if you put the WHERE clause before
the window functions (and WHERE is semantically prior to the evaluation of
a SELECT list at the same syntactic level), then the WHERE clause does
filter the rows before the window functions see 'em. But a WHERE clause
supplied at an outer syntactic level acts after the window functions are
processed.

Another way to put it is that your query with the view is equivalent to

select
*
from ( select
plant_id as plant_id
, row_number() over ( partition by city order by plant_id ) as plant_number
, count(*) over ( partition by city ) as of_plants_in_city
, city as city
from plant ) as plant_extend
where
plant_id = 'FOLSOM'
;

which is not at all the same thing as

select
plant_id as plant_id
, row_number() over ( partition by city order by plant_id ) as plant_number
, count(*) over ( partition by city ) as of_plants_in_city
, city as city
from plant
where
plant_id = 'FOLSOM'
;

precisely because the former specifies applying the WHERE filter after
the window functions run, while the latter specifies applying it before.

The fact that the results do change when you put the WHERE restriction
before the window functions is exactly why the optimization discussed
in that stackoverflow thread you mention isn't made. In some cases it
would theoretically be possible to prove that moving the filtering
wouldn't change the results, but Postgres isn't smart enough to do that
--- and even if it were, it would not push down the WHERE clause in either
this example or the stackoverflow one, because it could/would change the
results.

Or in short, this isn't a bug, but a counterexample to the stackoverflow
discussion.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Magnus Hagander 2013-08-25 17:01:28 Re: BUG #8397: pg_basebackup -x from new standby server sometimes causes Segmentation fault
Previous Message TAKATSUKA Haruka 2013-08-25 07:05:49 Re: BUG #8397: pg_basebackup -x from new standby server sometimes causes Segmentation fault