Re: joining two tables slow due to sequential scan

From: "Dave Dutcher" <dave(at)tridecap(dot)com>
To: "'Tim Jones'" <TJones(at)optio(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: joining two tables slow due to sequential scan
Date: 2006-02-10 23:25:20
Message-ID: 002601c62e99$434299b0$8300a8c0@tridecap.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

OK, if I'm reading this correctly, it looks like the planner is choosing
a sequential scan because it expects 48,000 rows for that
patientidentifier, but its actually only getting 3. The planner has the
number of rows right for the sequential scan, so it seems like the stats
are up to date. I would try increasing the stats for the
patientindentifier column with 'alter table set statistics...' or
increasing the default_statistics_target for the whole DB. Once you
have changed the stats I believe you need to run analyze again.

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Tim Jones
Sent: Friday, February 10, 2006 4:59 PM
To: Tom Lane
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] joining two tables slow due to sequential scan

ok here is real db

the first query I had seems to make no sense because it is only fast if
I limit the rows since almost all rows have status = 'AC'

second query
tables both have about 10 million rows and it takes a long time as you
can see but this person only has approx 160 total documents

QUERY PLAN
------------------------------------------------------------------------
-------------------------------------------------------------------
Hash Join (cost=84813.14..1510711.97 rows=48387 width=555) (actual
time=83266.854..91166.315 rows=3 loops=1)
Hash Cond: ("outer".documentidentifier =
"inner".dssdocumentidentifier)
-> Seq Scan on documentversions (cost=0.00..269141.98 rows=9677398
width=415) (actual time=0.056..49812.459 rows=9677398 loops=1)
-> Hash (cost=83660.05..83660.05 rows=48036 width=140) (actual
time=10.833..10.833 rows=3 loops=1)
-> Bitmap Heap Scan on clinicaldocuments
(cost=301.13..83660.05 rows=48036 width=140) (actual time=0.243..0.258
rows=3 loops=1)
Recheck Cond: (patientidentifier = 690193)
-> Bitmap Index Scan on ix_cdocpid (cost=0.00..301.13
rows=48036 width=0) (actual time=0.201..0.201 rows=3 loops=1)
Index Cond: (patientidentifier = 690193)
Total runtime: 91166.540 ms

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Friday, February 10, 2006 5:52 PM
To: Tim Jones
Cc: Scott Marlowe; Dave Dutcher; pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] joining two tables slow due to sequential scan

"Tim Jones" <TJones(at)optio(dot)com> writes:
> QUERY PLAN
> 'Hash Join (cost=899.83..4384.17 rows=482 width=1350) (actual
> time=0.203..0.203 rows=0 loops=1)'
> ...
> 'Total runtime: 0.392 ms'

Hardly seems like evidence of a performance problem ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2006-02-11 00:40:10 Re: What do the Windows pg hackers out there like for dev tools?
Previous Message Tim Jones 2006-02-10 22:59:03 Re: joining two tables slow due to sequential scan