Re: Poor man's partitioned index .... not being used?

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Gunther <raj(at)gusw(dot)net>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Poor man's partitioned index .... not being used?
Date: 2019-03-22 12:56:57
Message-ID: CAKJS1f_mPfT6s8_9T+BkHJ=ST2hb9Y918qPjG2qBMNFmZ5oXPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, 22 Mar 2019 at 07:57, Gunther <raj(at)gusw(dot)net> wrote:
> foo=# PREPARE testplan(int) AS
> foo-# SELECT * FROM Test WHERE mod(id,2) = mod($1,2) AND id = $1;
> PREPARE
> foo=# EXPLAIN EXECUTE testplan(8934);
> QUERY PLAN
> --------------------------------------------------------------------------
> Index Only Scan using test_pk0 on test (cost=0.42..8.44 rows=1 width=4)
> Index Cond: (id = 8934)
> (2 rows)
>
> That's quite alright actually. Now the questions is, could we use this in a nested loop query plan? That's where I think it can't work:

Not really. In that case, the parameters were replaced with the
specified values (a.k.a custom plan). That happens for the first 5
executions of a prepared statement, and in this case likely the
planner will continue to use the custom plan since the generic plan
won't know that the partial index is okay to use and the plan costs
would likely go up enough that the custom plan would continue to be
favoured.

> foo=# SET enable_mergejoin TO off;
> SET
> foo=# EXPLAIN SELECT * FROM Test2 a LEFT OUTER JOIN Test b ON( mod(b.id,2) = mod(a.id,2) AND b.id = a.id) LIMIT 10;
> QUERY PLAN
> --------------------------------------------------------------------------------
> Limit (cost=0.00..102516.78 rows=10 width=8)
> -> Nested Loop Left Join (cost=0.00..38238760.24 rows=3730 width=8)
> Join Filter: ((b.id = a.id) AND (mod(b.id, 2) = mod(a.id, 2)))
> -> Seq Scan on test2 a (cost=0.00..54.30 rows=3730 width=4)
> -> Materialize (cost=0.00..9056.93 rows=388129 width=4)
> -> Seq Scan on test b (cost=0.00..5599.29 rows=388129 width=4)
> (6 rows)
>
> It looks like it doesn't want to evaluate the mod(a.id, 2) before it moves to the index query for the nested loop.

Whether partial indexes can be used are not is determined using only
quals that can be applied at the scan level. In this case your qual
is a join qual, and since no other qual exists that can be evaluated
at the scan level where the index can be used, then it's not
considered. In any case, nothing there guarantees that one of your
indexes will match all records. For it to work, both of you indexes
would have to be scanned. It's not clear why you think that would be
any better than scanning just one index. I imagine it would only ever
be a win if you could eliminate one of the index scans with some qual
that guarantees that the index can't contain any records matching your
query.

> I wonder if there was a way of marking such expressions as safe in the query, like suggesting a certain evaluation order, i.e.,
>
> SELECT * FROM Test2 a LEFT OUTER JOIN Test b ON(mod(b.id,2) = EVAL(mod(a.id,2)) AND b.id = a.id) LIMIT 10;
>
> It's OK though. It just goes to show that in a case like this, it is best to just go with the partitioned table anyway.

It sounds like you might want something like partition-wise join that
exists in PG11.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Gunther 2019-03-22 20:01:38 Re: EXPLAIN PLAN for DELETE CASCADE or DELETE not using pkey index despite EXPLAINing that it would?
Previous Message Tom Lane 2019-03-21 21:16:56 Re: EXPLAIN PLAN for DELETE CASCADE or DELETE not using pkey index despite EXPLAINing that it would?