From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: (Questioning the planner's mind) - was Re: Fastest way to drop an index?
Date: 2009-02-05 07:45:04
Message-ID: E30A0C62-8D75-4B74-AA00-9644DF7E25C6@solfertje.student.utwente.nl
Lists: pgsql-general
On Feb 4, 2009, at 5:23 AM, Phoenix Kiula wrote:
> On Wed, Feb 4, 2009 at 5:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> writes:
>>> Index Scan using new_idx_testimonials_userid on testimonials
>>> (cost=0.00..157.78 rows=1 width=9) (actual time=8809.715..8809.715
>>> rows=0 loops=1)
>>> Index Cond: ((user_id)::text = 'superman'::text)
>>> Filter: ((title_encrypted)::text =
>>> 'b333dc1b0992cb8c70b58a418211389a'::text)
>>> Total runtime: 8809.750 ms
>>
>> This is using the index to fetch the rows that match user_id = 'superman',
>> and then testing each fetched row to see if it has the desired value of
>> title_encrypted. The fact that hardly any rows pass the filter test
>> tells nearly nothing about how long this should be expected to run.
>> The rather high estimated cost suggests that the planner thinks there
>> are several dozen rows matching the index condition, and the actual
>> runtime suggests that there are actually hundreds or thousands of 'em.
>> If so, your problem is that you need a different index. I'd bet on an
>> index on title_encrypted being more useful for this query than the one
>> on user_id; or you could experiment with a two-column index.
>>
>
> Thanks Tom. My thinking exactly.
>
> So I have already made a two-column index on (user_id,
> title_encrypted). It's done.
>
> But the planner keeps insisting on using the "user_id" index, as you
> can see from that EXPLAIN ANALYZE. This was run when the two-column
> index already existed!
>
> Could I force the 2-col index? I googled for "force index postgresql"
> and came upon this discussion -
> http://archives.postgresql.org/pgsql-sql/2006-02/msg00190.php - which
> suggests that the planner selects indexes based on "cost".
>
> I am not too technically savvy, but I think this means that given the
> choice of these two scenarios...
>
> 1. Search through "user_id" index, and then limit it by
> "title_encrypted"
> or
> 2. Search through "user_id, title_encrypted" 2-col index
>
> ...the planner decides that it is less resource intensive to go
> through the somewhat smaller user_id index and then limit it (i.e.,
> scenario 1) than to wade through the bigger second index.
The EXPLAIN ANALYZE of the second form of the query would tell us. I
suppose you'd rather not lose your new_idx_testimonials_userid index,
but you can drop it inside a transaction and roll back. Be aware that
DROP INDEX takes an exclusive lock on the table, so concurrent queries
against testimonials will block until you commit or roll back. Caution
is advised ;)
Try:

BEGIN;
EXPLAIN ANALYZE SELECT * FROM testimonials
 WHERE user_id = 'superman'
   AND title_encrypted = 'b333dc1b0992cb8c70b58a418211389a';
DROP INDEX new_idx_testimonials_userid;
ANALYZE testimonials;
EXPLAIN ANALYZE SELECT * FROM testimonials
 WHERE user_id = 'superman'
   AND title_encrypted = 'b333dc1b0992cb8c70b58a418211389a';
ROLLBACK;
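Whether the planner's row estimate for user_id = 'superman' really is
as far off as Tom suspects can also be checked directly against the
collected statistics. A sketch - the pg_stats view and its columns are
standard, the table and column names are taken from this thread:

```sql
-- Inspect the statistics the planner uses for user_id.
-- n_distinct and the most-common-values list drive the row estimate
-- for the condition user_id = 'superman'.
SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'testimonials'
   AND attname = 'user_id';
```

If 'superman' doesn't appear in most_common_vals, the planner falls
back to a generic estimate, which can be badly wrong for a skewed
distribution; raising the column's statistics target and re-running
ANALYZE may help in that case.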
> Am I on the right track? If I am, well what's the way around this? How
> can I make the planner use the 2-col index? Or if my understanding is
> not right, why is scenario 1 being chosen to begin with?
>
> Thanks for any thoughts! This single query, which used to be much
> faster, is now slowing down our operations by about 8 seconds
> per query!
>
> Let me know if you need to know any postgresql.conf settings. The only
> index-related settings I know of are these:
>
> enable_indexscan = on
> enable_bitmapscan = off
> enable_nestloop = on
I think for this case the planner might want to use a bitmap scan, and
by turning enable_bitmapscan 'off' you made that plan look
prohibitively expensive to it. The earlier query plan would show this.
There shouldn't be any need to disable bitmap scans in the first
place; you might want to try the same query with bitmap scans enabled
and see how it performs.
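If you don't want to flip enable_bitmapscan globally in
postgresql.conf, you can re-enable it for just one session and compare
the plans. A sketch using the query and names from this thread:

```sql
-- Re-enable bitmap scans for this session only and compare plans.
SET enable_bitmapscan = on;
EXPLAIN ANALYZE SELECT * FROM testimonials
 WHERE user_id = 'superman'
   AND title_encrypted = 'b333dc1b0992cb8c70b58a418211389a';
RESET enable_bitmapscan;  -- back to the configured value
```

SET only affects the current session, so this is safe to try on a
running system.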
Another possible cause is that both indexes are quite large and can't
both fit into memory. I suspect the new_idx_testimonials_userid index
gets used in other queries as well, so it's likely in the cache. That
could make using that index and scanning through the results faster
than reading the new index from disk.
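You can check whether that's plausible by comparing the on-disk sizes
of the indexes. A sketch that lists every index on the table, so no
index names need to be guessed:

```sql
-- Show the on-disk size of each index on testimonials.
SELECT c.relname AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS size
  FROM pg_class c
  JOIN pg_index i ON i.indexrelid = c.oid
 WHERE i.indrelid = 'testimonials'::regclass;
```

If the two-column index is much larger than your shared_buffers /
available cache, the planner's preference for the smaller, already
cached index becomes easier to understand.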
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.