From: | Zach Calvert <zachcalvert(at)hemerasoftware(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>, Luiz Eduardo Cantanhede Neri <lecneri(at)gmail(dot)com>, pgsql-novice(at)postgresql(dot)org, Dan Scott <balancedtree(at)gmail(dot)com> |
Subject: | Re: Query Optimization |
Date: | 2009-05-27 16:02:46 |
Message-ID: | 89af81ab0905270902r29c8e2s19f07a9626cb4cc7@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
I've added the leaderboardid, score index and it still does the
sequential scan. I'm going to try inserting a few hundred thousand
rows and seeing if it switches to index scan.
On Wed, May 27, 2009 at 10:34 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sean Davis <sdavis2(at)mail(dot)nih(dot)gov> writes:
>>> zachcalvert(at)hemerasoftware(dot)com> wrote:
>>>> I have a query and I have run
>>>> explain analyze
>>>> select count(*)
>>>> from score
>>>> where leaderboardid=35 and score <= 6841 and active
>>>>
>>>> The result is
>>>> "Aggregate (cost=2491.06..2491.07 rows=1 width=0) (actual
>>>> time=38.878..38.878 rows=1 loops=1)"
>>>> " -> Seq Scan on score (cost=0.00..2391.17 rows=39954 width=0)
>>>> (actual time=0.012..30.760 rows=38571 loops=1)"
>>>> " Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
>>>> "Total runtime: 38.937 ms"
>>>>
>>>> I have an index on score, I have an index on score, leaderboardid, and
>>>> active and still it does a sequential scan.
>
>> Postgresql is aware of the "cost" associated with each query. In the case
>> of a small table with an index that is not very discriminative, it may
>> choose a sequential scan. However, as you add more rows, the index scan may
>> become more effective and may be used instead. One thing to keep in mind is
>> that an index scan is NOT always faster than a sequential scan.
>
> A crude rule of thumb is that you need the query to fetch less than
> ten percent of the rows before a bitmap scan is going to be a win,
> and less than one percent before a plain indexscan is going to be a win.
> (If your database is entirely cached in memory then the crossover
> percentages are higher, and you need to adjust the planner's cost
> parameters so that it gets this right.) It's not clear exactly how
> big this table is, but I'm betting the query is fetching more than
> ten percent of it.
>
> One point worth making is that if this is the typical set of conditions
> in your queries, then the best index would be one on (leaderboardid,
> score) not (score, leaderboardid, active). (I'm betting that the
> condition active = true is so nonselective it's not worth keeping it in
> the index at all.) You want equality conditions on the leading
> column(s) and inequalities on the trailing columns. To see why this is,
> think about the index sort ordering and the portion of the index that
> the query will have to scan. In the latter case the set of index
> entries matching this query is a contiguous group; in the former, not.
>
> Our fine manual has a reasonable amount of detail about proper index
> design:
> http://www.postgresql.org/docs/8.3/static/indexes.html
>
> regards, tom lane
>
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Schnabel | 2009-05-27 19:03:33 | table check constraint |
Previous Message | Thomas Kellerer | 2009-05-27 16:00:39 | Re: Tool for modeling |