Quick Links

Planner chooses multi-column index in 9.2 when maybe it should not

From:	Greg Sabino Mullane <greg(at)endpoint(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Planner chooses multi-column index in 9.2 when maybe it should not
Date:	2012-10-10 17:06:55
Message-ID:	20121010170631.GK9910@tinybird.home
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

We are seeing a performance regression when moving to 9.2. There is a
complex query that is doing a self-join, but long story short, it
is choosing to use a multi-column index when it really ought not to be.
I was not able to develop a standalone test case without resorting
to changing enable_seqscan, but this does show the difference:

CREATE TABLE gregtest (a TEXT PRIMARY KEY, b TEXT);
CREATE INDEX gregtest_i ON gregtest(b,a);
SET enable_seqscan = off;
EXPLAIN SELECT 1 FROM gregtest WHERE a <> 'panda' AND b <> 'yak';

On versions 8.2, 8.3, 8.4, 9.0, and 9.1, this gives:

Seq Scan on gregtest (cost=10000000000.00..10000000022.90 rows=855 width=0)
Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))

Which makes sense - I would imagine that b = 'yak' would use the index,
but the negation means the index is not very useful?

However, on 9.2, this gives:

Bitmap Heap Scan on gregtest (cost=8.76..31.66 rows=855 width=0)
Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))
-> Bitmap Index Scan on gregtest_i (cost=0.00..8.55 rows=860 width=0)

The above was tested on stock versions of Postgres, with no changes
made to postgresql.conf. In the actual query, the result is something like
this on 9.2 (columns explained below):

Nested Loop (cost=0.00..6050226723847.12 rows=282638194054762 width=8)
Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
-> Index Only Scan using index_i on foobar m (cost=0.00..889187.83 rows=16998032 width=8)
-> Materialize (cost=0.00..1079773.42 rows=16627702 width=8)
-> Index Only Scan using index_i on foobar o (cost=0.00..931682.91 rows=16627702 width=8)
Filter: ((status)::text <> 'split'::text)

But like this on 8.3:

Nested Loop (cost=1003294.60..8207409555713.15 rows=283931552087940 width=8)
Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
-> Seq Scan on foobar o (cost=0.00..962314.95 rows=16672204 width=8)
Filter: ((status)::text <> 'split'::text)
-> Materialize (cost=1003294.60..1240121.96 rows=17030236 width=8)
-> Seq Scan on foobar m (cost=0.00..919739.36 rows=17030236 width=8)

In the above, foobar has a primary key on foo, and an index named
index_i on foobar(status, foo). In another variation of the query,
8.3 uses foobar_pkey as well, rather than index_i, and filters that.
Matter of fact, index_i is never used.

At any rate, the change causes the original query to run much, much
slower. Problem on 9.2? Something wrong with our system and/or query?
More information needed from me?

--
Greg Sabino Mullane greg(at)endpoint(dot)com
End Point Corporation
PGP Key: 0x14964AC8

Responses

Re: Planner chooses multi-column index in 9.2 when maybe it should not at 2012-10-10 17:31:29 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Tom Lane	2012-10-10 17:31:29	Re: Planner chooses multi-column index in 9.2 when maybe it should not
Previous Message	Chris Ernst	2012-10-10 16:35:06	Re: pg_upgrade not detecting version properly