Quick Links

Re: Strange (?) Index behavior?

From:	Allen Landsidel <alandsidel(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Strange (?) Index behavior?
Date:	2004-11-05 22:40:23
Message-ID:	88f1825a04110514407ef8a11b@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Fri, 05 Nov 2004 16:08:56 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Allen Landsidel <alandsidel(at)gmail(dot)com> writes:
> > With seqscan enabled however, "AB%" will use the index, but "A%" will not.
>
> > The estimated cost for the query is much higher without the partial
> > indexes than it is with them, and the actual runtime of the query is
> > definitely longer without the partial indexes.
>
> OK. This suggests that the planner is drastically misestimating
> the selectivity of the 'A%' clause, which seems odd to me since in
> principle it could get that fairly well from the ANALYZE histogram.
> But it could well be that you need to increase the resolution of the
> histogram --- see ALTER TABLE SET STATISTICS.

I will look into this.

>
> Did you ever show us EXPLAIN ANALYZE results for this query?

No, I didn't. I am running it now without the partial index on to
give you the results but it's (the 'A%' problem query) been running
pretty much since I got this message (an hour ago) and is still not
finished.

The EXPLAIN results without the ANALYZE will have to suffice until
it's done, I can readd the index, and run it again, so you have both
to compare to.

First two queries run where both the main index, and the 'A%' index exist:

-- QUERY 1
search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'A%';
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using test_name_idx_a on "test" (cost=0.00..8605.88
rows=391208 width=20)
Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text))
Filter: (test_name ~~ 'A%'::text)
(3 rows)

Time: 16.507 ms

-- QUERY 2
search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'A%' AND
test_name LIKE 'AB%';
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_name_idx_a on "test" (cost=0.00..113.79
rows=28 width=20)
Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text)
AND (test_name >= 'AB'::text) AND (test_name < 'AC'::text))
Filter: ((test_name ~~ 'A%'::text) AND (test_name ~~ 'AB%'::text))
(3 rows)

Time: 3.197 ms

Ok, now the same two queries after a DROP INDEX test_name_idx_a;

search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'A%';
QUERY PLAN
-----------------------------------------------------------------------------------------------
Index Scan using test_name_unique on "test" (cost=0.00..1568918.66
rows=391208 width=20)
Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text))
Filter: (test_name ~~ 'A%'::text)
(3 rows)

Time: 2.470 ms

search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'AB%';
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using test_name_unique on "test" (cost=0.00..20379.49
rows=5081 width=20)
Index Cond: ((test_name >= 'AB'::text) AND (test_name < 'AC'::text))
Filter: (test_name ~~ 'AB%'::text)
(3 rows)

Time: 2.489 ms

------------------
Copying just the costs you can see the vast difference...
Index Scan using test_name_unique on "test" (cost=0.00..1568918.66
rows=391208 width=20)
Index Scan using test_name_unique on "test" (cost=0.00..20379.49
rows=5081 width=20)

Index Scan using test_name_idx_a on "test" (cost=0.00..8605.88
rows=391208 width=20)
Index Scan using test_name_idx_a on "test" (cost=0.00..113.79
rows=28 width=20)

Lastly no, neither of these row guesstimates is correct.. I'll get
back and tell you how much they're off by if it's important, once this
query is done.

The odd thing is it used the index scan here each time -- that has not
always been the case with the main unique index, it's trying to make a
liar out of me heh.

I'm used to the estimates and plan changing from one vacuum analyze to
the next, even without any inserts or updates between.. the index scan
is always used however when I have the partial indexes in place, and
something like..

CREATE TEMP TABLE t1 AS
SELECT field FROM table
WHERE field LIKE 'A%'
AND field LIKE 'AA%';

runs in 6-8 seconds as well, with a bit under 100k records.

-Allen

In response to

Re: Strange (?) Index behavior? at 2004-11-05 21:08:56 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Mark Wong	2004-11-05 23:13:45	ia64 results with dbt2 and 8.0beta4
Previous Message	Merlin Moncure	2004-11-05 21:17:02	Re: postgresql amd-64