Re: Select time jump after adding filter; please help me figure out what I'm doing wrong.

From: Andrew Edson <cheighlund(at)yahoo(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Select time jump after adding filter; please help me figure out what I'm doing wrong.
Date: 2007-08-15 15:53:06
Message-ID: 527382.96409.qm@web34210.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I apologize about the CC; I thought I had done so.

There are fourteen (14) distinct values in rcrd_cd. And I don't know if this counts as something odd, but I got the following values by doing a vacuum full analyze, then running the set with index, dropping index, running set without.

Values with index:
>
> attest=# select count(*) from ptrans;
> 16488578
>
> attest=# select count(*) from ptrans where rcrd_cd = '0A';
> 6701655
>
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = ' > 0A';
> Bitmap Heap Scan on ptrans (cost=1223.86..151183.39 rows=87439 width=21) (actu > al time=2255.640..70371.304 rows=6701655 loops=1)
> -> Bitmap Index Scan on ptrans_cid_trandt_idx (cost=0.00..1223.86 rows=8743 > 9 width=0) (actual time=2216.856..2216.856 rows=204855 loops=1)
> Total runtime: 89964.419 ms
>
Values without index:
>
> attest=# select count(*) from ptrans;
> 16488578
>
> attest=# select count(*) from ptrans where rcrd_cd = '0A';
> 204855
>
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';
> Seq Scan on ptrans (cost=0.00..384813.22 rows=87439 width=21) (actual time=20.286..65330.049 rows=204855 loops=1)
> Filter: (rcrd_cd = '0A'::bpchar)
> Total runtime: 65945.160 ms

Again, someone mind pointing out to me where I've managed to mess this one up?

Richard Huxton <dev(at)archonet(dot)com> wrote:
Don't forget to CC: the list.

Andrew Edson wrote:
> I apologize; You are correct in that I mistyped my original structure. Here is the information for the correct explain and explain analyze statements.
>
> attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';

No need for the simple explain - explain analyse includes all the
information.

> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on ptrans (cost=1223.86..149853.23 rows=85741 width=21) (actual time=2302.363..70321.838 rows=6701655 loops=1)
> -> Bitmap Index Scan on ptrans_cid_trandt_idx (cost=0.00..1223.86 rows=85741 width=0) (actual time=2269.064..2269.064 rows=204855 loops=1)
> Total runtime: 89854.843 ms

Well, it's taking 90 seconds to return 6.7 million rows. Depending on
your system and memory settings, that might not be unreasonable.

It *is* getting the estimate of returned rows wrong (it thinks 85,741
will match) which is hugely out of line. Is there something odd with
this table/column or haven't you analysed recently? How many unique
values does rcrd_cd have, and how many rows does the table have?

I don't know that you'll get this down to sub-second responses though,
not if you're trying to return 6 million rows from an even larger table.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


---------------------------------
Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Phoenix Kiula 2007-08-15 15:53:37 Re: Customizing psql console to show execution times
Previous Message Decibel! 2007-08-15 15:52:27 Re: Interpreting statistics collector output