Re: 7.3.1 New install, large queries are slow

From: "Roman Fail" <rfail(at)posportal(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 7.3.1 New install, large queries are slow
Date: 2003-01-16 19:22:03
Message-ID: 9B1C77393DED0D4B9DAA1AA1742942DA0E4C06@pos_pdc.posportal.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Josh Berkus wrote:
> And MSSQL is returning results in 3 seconds? I find that a bit hard
> to believe, unless this query is called repeatedly and that's the
> figure for the last call, where the records are being cached. I'll
> have to look at your hardware descriptions again.

Hardware-wise, the Postgres server is a hot rod and MSSQL is a basic vanilla server. I changed all the WHERE clauses to radically different values and couldn't get it to take more than 5 seconds on MSSQL. Most of it's cost savings seems to come from some kind of "Table Spool/Lazy Spool" in it's execution plan, which looks to me like it only exists for the life of the query. You can read more about this at:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/optimsql/odp_tun_1_1m7g.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/optimsql/odp_tun_1_7rjg.asp
Maybe there are some good ideas here for Postgres. Unfortunately, the MSSQL Execution Plan is displayed graphically, and I can't figure out a way to get it to text without typing it all. I could do some screen shots if you really want to see it.

> Stephan Szabo wrote:
> I'd assume that tranamount values are fairly randomly distributed
> throughout the table, right? It takes about 5 minutes for the
> system to read the entire table and more for the index scan, so
> you're probably reading most of the table randomly and the index
> as well.
> What values on batchdetail do you use in query where clauses regularly?

Yes, tranamount values are randomly distributed. I don't understand why an index scan would be "random", isn't the whole point of an index to have an ordered reference into the data? batchdetail has 5 columns that can be in the WHERE clause, all of which are indexed. None is more likely than the other to be searched, so a clustered index doesn't make much sense to me. The whole thing needs to be fast.

>> Nope. This was a misimpression caused by batchdetail waiting for a
>> bunch of other processes to complete. Sometimes the parallelizing
>> gives me a wrong impression of what's holding up the query. Sorry if I
>> confused you.
>
>I'm still not sure that it isn't a big part given that the time went down
>by a factor of about 4 when index scans were disabled and a sequence scan
>was done and that a sequence scan over the table with no other tables
>joined looked to take about 5 minutes itself and the difference between
>that seqscan and the big query was only about 20 seconds when
>enable_indexscan was off unless I'm misreading those results.

You are not misreading the results. There was a huge difference. Nobody has ever made note of it, but this still seems very odd to me:
*** 'sar -b' during the query
with index scan: 6,000 block reads/sec
with seq scan: 95,000 block reads/sec


Tom, here is the EXPLAIN for your suggested version of the query with enable_indexscan=on. I performed the same query with enable_indexscan=off and total runtime was *much* better: 296174.60 msec. By the way, thank you for your detailed description of how the JOIN order affects the outcome - I understand much better now.
EXPLAIN ANALYZE
SELECT ss.batchdate, ss.batchdetailid, ss.bankno, ss.trandate, ss.tranamount,
ss.submitinterchange, ss.authamount, ss.authno, ss.cardtypeid, ss.mcccode,
ss.name AS merchantname, ss.cardtype, ss.merchid,
p1.localtaxamount, p1.productidentifier, dr.avsresponse,
cr.checkoutdate, cr.noshowindicator, ck.checkingacctno,
ck.abaroutingno, ck.checkno
FROM
(SELECT b.batchdate, d.batchdetailid, t.bankno, d.trandate, d.tranamount,
d.submitinterchange, d.authamount, d.authno, d.cardtypeid, d.mcccode,
m.name, c.cardtype, m.merchid
FROM tranheader t, batchheader b, merchants m, cardtype c, batchdetail d
WHERE t.tranheaderid=b.tranheaderid
AND m.merchantid=b.merchantid
AND d.batchid=b.batchid
AND c.cardtypeid=d.cardtypeid
AND t.clientid = 6
AND d.tranamount BETWEEN 500.0 AND 700.0
AND b.batchdate > '2002-12-15'
AND m.merchid = '701252267') ss
LEFT JOIN purc1 p1 on p1.batchdetailid=ss.batchdetailid
LEFT JOIN direct dr ON dr.batchdetailid = ss.batchdetailid
LEFT JOIN carrental cr ON cr.batchdetailid = ss.batchdetailid
LEFT JOIN checks ck ON ck.batchdetailid = ss.batchdetailid
ORDER BY ss.batchdate DESC
LIMIT 50
Limit (cost=1601637.75..1601637.75 rows=1 width=285) (actual time=1221606.41..1221606.42 rows=5 loops=1)
-> Sort (cost=1601637.75..1601637.75 rows=1 width=285) (actual time=1221606.40..1221606.41 rows=5 loops=1)
Sort Key: b.batchdate
-> Nested Loop (cost=1543595.18..1601637.74 rows=1 width=285) (actual time=1204815.02..1221606.27 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1543595.18..1601581.23 rows=1 width=247) (actual time=1204792.38..1221560.42 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1543595.18..1601581.22 rows=1 width=230) (actual time=1204792.35..1221560.27 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1543595.18..1601581.21 rows=1 width=221) (actual time=1204792.31..1221560.09 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1543595.18..1545529.17 rows=1 width=202) (actual time=1195376.48..1195578.86 rows=5 loops=1)
Join Filter: ("inner".tranheaderid = "outer".tranheaderid)
-> Nested Loop (cost=1543595.18..1545449.98 rows=1 width=186) (actual time=1195370.72..1195536.53 rows=5 loops=1)
Join Filter: ("inner".cardtypeid = "outer".cardtypeid)
-> Merge Join (cost=1543595.18..1545448.76 rows=1 width=172) (actual time=1195311.88..1195477.32 rows=5 loops=1)
Merge Cond: ("outer".batchid = "inner".batchid)
-> Sort (cost=476.17..476.18 rows=4 width=102) (actual time=30.57..30.59 rows=17 loops=1)
Sort Key: b.batchid
-> Nested Loop (cost=0.00..476.14 rows=4 width=102) (actual time=25.21..30.47 rows=17 loops=1)
-> Index Scan using merchants_ix_merchid_idx on merchants m (cost=0.00..5.65 rows=1 width=78) (actual time=23.81..23.82 rows=1 loops=1)
Index Cond: (merchid = '701252267'::character varying)
-> Index Scan using batchheader_ix_merchantid_idx on batchheader b (cost=0.00..470.30 rows=15 width=24) (actual time=1.38..6.55 rows=17 loops=1)
Index Cond: ("outer".merchantid = b.merchantid)
Filter: (batchdate > '2002-12-15 00:00:00'::timestamp without time zone)
-> Sort (cost=1543119.01..1544045.79 rows=370710 width=70) (actual time=1194260.51..1194892.79 rows=368681 loops=1)
Sort Key: d.batchid
-> Index Scan using batchdetail_ix_tranamount_idx on batchdetail d (cost=0.00..1489103.46 rows=370710 width=70) (actual time=5.26..1186051.44 rows=370307 loops=1)
Index Cond: ((tranamount >= 500.0) AND (tranamount <= 700.0))
-> Seq Scan on cardtype c (cost=0.00..1.10 rows=10 width=14) (actual time=11.77..11.79 rows=10 loops=5)
-> Seq Scan on tranheader t (cost=0.00..55.15 rows=1923 width=16) (actual time=0.02..5.46 rows=1923 loops=5)
Filter: (clientid = 6)
-> Seq Scan on purc1 p1 (cost=0.00..44285.35 rows=941335 width=19) (actual time=10.79..3763.56 rows=938770 loops=5)
-> Seq Scan on direct dr (cost=0.00..0.00 rows=1 width=9) (actual time=0.00..0.00 rows=0 loops=5)
-> Seq Scan on carrental cr (cost=0.00..0.00 rows=1 width=17) (actual time=0.00..0.00 rows=0 loops=5)
-> Seq Scan on checks ck (cost=0.00..40.67 rows=1267 width=38) (actual time=0.77..7.15 rows=1267 loops=5)
Total runtime: 1221645.52 msec


> Tomasz Myrta wrote:
> Are there any where clauses which all of theses variation have?

Yes.....WHERE clientid = ? will appear in every query. The others are present based on user input.


> Ron Johnson wrote:
> What are the indexes on batchdetail?
> There's one on batchid and a seperate one on tranamount?
> If so, what about dropping them and create a single multi-segment
> index on "batchid, tranamount". (A constraint can then enforce
> uniqueness on batchid.

There is no index on batchid, I think it is a good idea to create one. Stephan also suggested this. After I try the single batchid index, I might try to multi-segment index idea as well. I'll post results later today.

> Stephan Szabo wrote:
> Then I realized that batchheader.batchid and
> batchdetail.batchid don't even have the same
> type, and that's probably something else you'd
> need to fix.

Yes, that's a mistake on my part....batchdetail(batchid) should be an int8. It looks to me like converting this datatype can't be done with a single ALTER TABLE ALTER COLUMN statement.....so I guess I'll work around it with an ADD, UPDATE, DROP, and RENAME.

> Josh Berkus wrote:
> Um, Roman, keep in mind this is a mailing list. I'm sure that
> everyone here is happy to give you the tools to figure out how to fix
> things, but only in a DIY fashion, and not on your schedule.

I hate being defensive, but I don't remember saying that I expect anyone to fix my problems for me on my schedule. *I* hope that *I* can get this done by Friday, because otherwise my boss is going to tell me to dump Postgres and install MSSQL on the server. I only mention this fact because it's a blow against PostgreSQL's reputation if I have to give up. There is no pressure on you, and I apologize if something I said sounded like whining.

I am VERY grateful for the time that all of you have given to this problem.

Roman Fail
Sr. Web Application Programmer
POS Portal, Inc.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stephan Szabo 2003-01-16 19:35:55 Re: 7.3.1 New install, large queries are slow
Previous Message Stephan Szabo 2003-01-16 18:43:02 Re: 7.3.1 New install, large queries are slow