Re: Avoiding seq scan over 3.2 millions rows

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Andrus" <kobruleht2(at)hot(dot)ee>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Avoiding seq scan over 3.2 millions rows
Date: 2008-11-12 02:29:00
Message-ID: 21758.1226456940@sss.pgh.pa.us
Lists: pgsql-general

"Andrus" <kobruleht2(at)hot(dot)ee> writes:
> explain analyze SELECT sum(xxx)
> FROM dok JOIN rid USING (dokumnr)
> WHERE dok.kuupaev BETWEEN '2008-04-01' AND '2008-04-30'

> "Hash Join (cost=29584.84..308259.32 rows=142686 width=0) (actual
> time=68510.748..96932.174 rows=117883 loops=1)"
> " Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
> " -> Seq Scan on rid (cost=0.00..195599.68 rows=3217368 width=4) (actual
> time=17.130..56572.857 rows=3247363 loops=1)"
> " -> Hash (cost=29243.76..29243.76 rows=53231 width=4) (actual
> time=15878.782..15878.782 rows=44685 loops=1)"
> " -> Index Scan using dok_kuupaev_idx on dok (cost=0.00..29243.76
> rows=53231 width=4) (actual time=47.102..15651.059 rows=44685 loops=1)"
> " Index Cond: ((kuupaev >= '2008-04-01'::date) AND (kuupaev <=
> '2008-04-30'::date))"
> "Total runtime: 97364.282 ms"

> The query performs a seq scan over 3.2 million rows.

There isn't anything particularly wrong with that plan. The alternative
that you seem to be wishing for would involve ~50000 index probes into
"rid", which is hardly going to be free.

You could try reducing random_page_cost to push the planner in the
direction of preferring the indexscan solution, but whether this is
actually better in your situation remains to be seen.
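
If you do experiment with that, it's easy to try per-session before touching
postgresql.conf (untested sketch; 4.0 is the default value):

    -- lower random_page_cost for this session only and re-test
    SET random_page_cost = 2;
    EXPLAIN ANALYZE SELECT sum(xxx)
    FROM dok JOIN rid USING (dokumnr)
    WHERE dok.kuupaev BETWEEN '2008-04-01' AND '2008-04-30';
    RESET random_page_cost;

If the indexscan plan wins consistently for your real workload, you could
then make the lower setting permanent in postgresql.conf.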

regards, tom lane
