Re: Query plan for views and WHERE clauses, Luke is not using the index

From: kimaidou <kimaidou(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query plan for views and WHERE clauses, Luke is not using the index
Date: 2017-08-22 07:24:12
Message-ID: CAMKXKO5xyQNZC5fk56JotNhNZZ50tN80AoTWG_bHr11HJA4gBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks a lot for your detailed explanation. I will try ASAP with no
DISTINCT ( we are quite sure it is not needed anyway ), and report back
here.

Michaël

2017-08-21 23:52 GMT+02:00 David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>:

> On 19 August 2017 at 04:46, kimaidou <kimaidou(at)gmail(dot)com> wrote:
> > When we call the WHERE on the view:
> >
> > EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
> > SELECT *
> > FROM "qgep"."vw_qgep_reach"
> > WHERE "progression_geometry" &&
> > st_makeenvelope(1728327.03249295568093657,8240789.
> 26074041239917278,1728608.10987572139129043,8240958.
> 16933418624103069,3949)
> >
> >
> > The query plan is "wrong", as PostgreSQL seems to consider it should do a
> > seq scan on the tables, and only afterwards filter with the WHERE:
> > https://explain.depesz.com/s/wXV
> >
> > The query takes about 1 second instead of less than 100ms.
> >
> > Do you have any hint on this kind of issue ?
>
> This is by design due to the DISTINCT ON() clause. Only quals which
> filter columns which are in the DISTINCT ON can be safely pushed down.
>
> Consider the following, where I've manually pushed the WHERE clause.
>
> postgres=# create table tt (a int, b int);
> CREATE TABLE
> postgres=# create index on tt (a);
> CREATE INDEX
> postgres=# insert into tt values(1,1),(1,2),(2,1),(2,2);
> INSERT 0 4
> postgres=# select * from (select distinct on (a) a,b from tt order by
> a,b) tt where b = 2;
> a | b
> ---+---
> (0 rows)
>
>
> postgres=# select * from (select distinct on (a) a,b from tt where b =
> 2 order by a,b) tt;
> a | b
> ---+---
> 1 | 2
> 2 | 2
> (2 rows)
>
> Note the results are not the same.
>
> If I'd done WHERE a = 2, then the planner would have pushed the qual
> down into the subquery.
>
> More reading in check_output_expressions() in allpaths.c:
>
> /* If subquery uses DISTINCT ON, check point 3 */
> if (subquery->hasDistinctOn &&
> !targetIsInSortList(tle, InvalidOid, subquery->distinctClause))
> {
> /* non-DISTINCT column, so mark it unsafe */
> safetyInfo->unsafeColumns[tle->resno] = true;
> continue;
> }
>
> The comment for point 3 reads:
>
> * 3. If the subquery uses DISTINCT ON, we must not push down any quals
> that
> * refer to non-DISTINCT output columns, because that could change the set
> * of rows returned. (This condition is vacuous for DISTINCT, because then
> * there are no non-DISTINCT output columns, so we needn't check. Note
> that
> * subquery_is_pushdown_safe already reported that we can't use volatile
> * quals if there's DISTINCT or DISTINCT ON.)
>
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mariel Cherkassky 2017-08-22 14:23:24 query runs for more than 24 hours!
Previous Message Mark Kirkwood 2017-08-22 01:04:13 Re: Odd sudden performance degradation related to temp object churn