Re: index problem

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: CoL <col(at)mportal(dot)hu>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: index problem
Date: 2001-10-17 17:47:45
Message-ID: Pine.BSF.4.21.0110171037560.22689-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> > Hmm, does the explain change if you vacuum analyze the other table
> > (prog_data)? If not, what does explain show if you do a
> > set enable_seqscan='off';
> > before it?

Did you do the vacuum analyze on the other table (prog_data) as well?
It seems to be overestimating the number of joined rows, and I wonder
if it would choose a different plan if it had the correct number.

> The result:
> db=>set enable_seqscan='off';
> db=>explain select distinct prog_id,prog_ftype,prog_fcasthour,prog_date
> from prog_dgy_xy,prog_data where pxygy_pid=prog_id;
> NOTICE: QUERY PLAN:
>
> Unique (cost=7606982.10..7854887.48 rows=2479054 width=32)
> -> Sort (cost=7606982.10..7606982.10 rows=24790538 width=32)
> -> Merge Join (cost=0.00..335621.73 rows=24790538 width=32)
> -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
> (cost=0.00..323297.05 rows=921013 width=4)
> -> Index Scan using prog_data_pkey on prog_data
> (cost=0.00..701.12 rows=8872 width=28)
>
> It "seems" index is used, but the same result :(((, and bigger execution
> time: real 3m41.830s

Well, that means the plan it chose before was better, so enable_seqscan
isn't a win here.

> And why:
> POSTGRES:
> set enable_seqscan ='off'; select count(*) from prog_dgy_xy where
> pxygy_pid<13161;
> count
> --------
> 900029
> real 2m34.340s
> explain:
> Aggregate (cost=327896.89..327896.89 rows=1 width=0)
> -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
> (cost=0.00..325594.54 rows=920940 width=0)

It's estimating the entire table will be seen (or most of it anyway),
so it would choose Seq Scan as faster, but you've basically disallowed
that with the enable_seqscan='off'. Is it faster without the explicit
hint (it probably will be). Index Scans are not always better than
Sequence Scans (especially when traversing most of the table as in the
above) and you don't want to use the enable_* unless it actually
is giving you a performance increase.

Browse pgsql-sql by date

  From Date Subject
Next Message Joel Burton 2001-10-17 17:51:13 Re: Multiple Parameters to an Aggregate Function
Previous Message Nicolas KOWALSKI 2001-10-17 17:24:42 update in rule