Re: Use unique index for longer pathkeys.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Use unique index for longer pathkeys.
Date: 2014-07-28 09:47:57
Message-ID: 20140728.184757.261051744.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

> > > I think there is one more disadvantage in the way current patch is
> > > done which is that you need to collect index path keys for all relations
> > > irrespective of whether they will be of any use to eliminate useless
> > > pathkeys from query_pathkeys. One trivial case that comes to mind is
> > > when there are multiple relations involved in query and ORDER BY is
> > > base on columns of only part of the tables involved in query.
> >
> > Like this?
> >
> > select x.a, x.b, y.b from x, y where x.a = y.a order by x.a, x.b;
> >
> > Equivalent class consists of (x.a=y.a) and (x.b), so index
> > pathkeys for i_y is (y.a.=x.a). As a result, no common primary
> > pathkeys found.
>
> I think it will find common pathkey incase you have an unique index
> on x.a (please see the example below), but currently I am not clear
> why there is a need for a common index path key in such a case to
> eliminate useless keys in ORDER BY, why can't we do it based
> on individual table's path key.
>
> Example:
>
> create table t (a int not null, b int not null, c int, d text);
> create unique index i_t_pkey on t(a, b);
> insert into t (select a % 10, a / 10, a, 't' from generate_series(0,
> 100000) a);
> analyze;
>
> create table t1 (a int not null, b int not null, c int, d text);
> create unique index i_t1_pkey_1 on t1(a);
> create unique index i_t1_pkey_2 on t1(a, b);
> insert into t1 (select a * 2, a / 10, a, 't' from generate_series(0,
> 100000) a);
> explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by
> t1.a,t1.b,t1.c,t1.d;
>
> QUERY PLAN
> ------------------------------------------
> Merge Join
> Merge Cond: (t.a = t1.a)
> -> Index Scan using i_t_pkey on t
> -> Index Scan using i_t1_pkey_1 on t1
> (4 rows)
>
> Here we can notice that there is no separate sort key in plan.

Sure,

> Now drop the i_t1_pkey_1 and check the query plan again.
>
> drop index i_t1_pkey_1;
> explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by
> t1.a,t1.b,t1.c,t1.d;
> QUERY PLAN
> ------------------------------------------------
> Sort
> Sort Key: t.a, t1.b, t1.c, t1.d
> -> Merge Join
> Merge Cond: (t.a = t1.a)
> -> Index Scan using i_t_pkey on t
> -> Index Scan using i_t1_pkey_2 on t1
> (6 rows)
>
> Can't above plan eliminate Sort Key even after dropping index
> (i_t1_pkey_1)?

My patch doesn't so since there no longer a 'common primary
pathkeys' in this query. Perhaps the query doesn't allow the sort
eliminated. Since a is no more a pkey, t1 can have dulicate rows
for the same a, so the joined relation also may have duplicte
values in the column a. Therefore the joined relation may be half
sorted only by the column a so the sort pathkeys cannot be
trimmed.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-07-28 10:26:20 Re: PL/PgSQL: EXIT USING ROLLBACK
Previous Message Marko Tiikkaja 2014-07-28 09:34:41 Re: PL/PgSQL: EXIT USING ROLLBACK