Re: How does the query planner make its plan?

From: Reg Me Please <regmeplease(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Christian Schröder <cs(at)deriva(dot)de>
Subject: Re: How does the query planner make its plan?
Date: 2007-11-06 15:44:34
Message-ID: 200711061644.34621.regmeplease@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

It may depend on the index itself against the locales.

1. define the insex with the text_pattern_ops operato class
2. run vacuum analyze on the table
3. re-run the explain

See chapter 11 (especiallu 11.8) for v8.2

Il Tuesday 06 November 2007 16:25:09 Christian Schröder ha scritto:
> Hi list,
> once again I do not understand how the query planner works and why it
> apparently does not find the best result.
> I have a table with about 125 million rows. There is a char(5) column
> with a (non-unique) index. When I try to find the distinct values in
> this column using the following sql statement:
>
> select distinct exchange from foo
>
> the query planner chooses not to use the index, but performs a
> sequential scan. When I disfavour the use of sequential scans ("set
> enable_seqscan = off") the performance is more than 6 times better. Why
> does the query planner's plan go wrong? The table has been vacuum
> analyzed just before I ran the queries.
>
> Here is the plan when I let the query planner alone:
>
> QUERY PLAN
> ---------------------------------------------------------------------------
>----- Unique (cost=23057876.40..23683350.48 rows=4 width=9)
> -> Sort (cost=23057876.40..23370613.44 rows=125094816 width=9)
> Sort Key: exchange
> -> Seq Scan on quotes (cost=0.00..3301683.16 rows=125094816
> width=9)
> (4 rows)
>
> This is what really happens:
>
> QUERY PLAN
> ---------------------------------------------------------------------------
>------------------------------------------------------------ Unique
> (cost=23057876.40..23683350.48 rows=4 width=9) (actual
> time=1577159.744..1968911.024 rows=4 loops=1)
> -> Sort (cost=23057876.40..23370613.44 rows=125094816 width=9)
> (actual time=1577159.742..1927400.118 rows=125094818 loops=1)
> Sort Key: exchange
> -> Seq Scan on quotes (cost=0.00..3301683.16 rows=125094816
> width=9) (actual time=0.022..169744.162 rows=125094818 loops=1)
> Total runtime: 1969844.753 ms
> (5 rows)
>
> With "enable_seqscan = off" I get this plan:
>
> QUERY PLAN
> ---------------------------------------------------------------------------
>---------------------------- Unique (cost=0.00..89811549.81 rows=4 width=9)
> -> Index Scan using quotes_exchange_key on quotes
> (cost=0.00..89498812.77 rows=125094816 width=9)
> (2 rows)
>
> And again with execution times:
>
>
> QUERY PLAN
> ---------------------------------------------------------------------------
>----------------------------------------------------------------------------
>------- Unique (cost=0.00..89811549.81 rows=4 width=9) (actual
> time=0.079..313068.922 rows=4 loops=1)
> -> Index Scan using quotes_exchange_key on quotes
> (cost=0.00..89498812.77 rows=125094816 width=9) (actual
> time=0.078..273787.493 rows=125094818 loops=1)
> Total runtime: 313068.967 ms
> (3 rows)
>
> I understand that from looking at the estimations (89811549.81 with
> index scan vs. 23683350.48 with sequential scan) the query planner had
> to choose the sequential scan. So maybe I have to tune the planner cost
> constants? Indeed I did some changes to these values, but in my opinion
> this should make index scans preferable:
>
> #seq_page_cost = 1.0 # measured on an arbitrary scale
> #random_page_cost = 4.0 # same scale as above
> random_page_cost = 1.0
> #cpu_tuple_cost = 0.01 # same scale as above
> #cpu_index_tuple_cost = 0.005 # same scale as above
> cpu_index_tuple_cost = 0.001
> #cpu_operator_cost = 0.0025 # same scale as above
> #effective_cache_size = 128MB
> effective_cache_size = 4GB
>
> The machine is a dedicated database server with two dual-core xeon
> processors and 8 GB memory.
>
> Thanks for your help,
> Christian

--
Reg me Please

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2007-11-06 15:47:43 Re: (Never?) Kill Postmaster?
Previous Message John Smith 2007-11-06 15:40:27 external editor for psql