7.3.1 New install, large queries are slow

From: "Roman Fail" <rfail(at)posportal(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: 7.3.1 New install, large queries are slow
Date: 2003-01-15 18:00:04
Message-ID: 9B1C77393DED0D4B9DAA1AA1742942DA0E4BF9@pos_pdc.posportal.com
Lists: pgsql-performance

I am trying to get a PostgreSQL server into production (moving from MSSQL2K), but I am having some serious performance issues. PostgreSQL is new to me, and I'm only just now comfortable with Linux. So far I've successfully compiled Postgres from source and migrated all the data from MSSQL. Postgres is accessed primarily through JDBC.

I really want to use Postgres for production, but if I can't get better results out of it by the end of the week we are dropping it forever and going back to MSSQL despite the $$$. I'm basically at a point where I've got to find help from the list. Please help me make this server fly!

I have a query that does many joins (including two very big tables) and it is very slow on Postgres: it takes 19 minutes on PostgreSQL versus 3 seconds on MSSQL. The two servers have the same indexes created (including primary key indexes). I finally gave up on creating all the foreign keys in Postgres after 12 hours of 100% CPU. It's hard for me to believe that the hardware is the bottleneck: the $20k Postgres server far outclasses the MSSQL server (see below for stats). While EXPLAIN ANALYZE ran this query, the CPU averaged 5%, sar -b showed about 6,000 block reads/sec, and vmstat showed zero swapping. The EXPLAIN results are below; I'm not sure how to interpret them.
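
For what it's worth, the foreign keys I gave up on were being added with plain ALTER TABLE statements along these lines (using one of the child tables as an example; the constraint name is just illustrative, not the exact one from my schema):

ALTER TABLE purc1
    ADD CONSTRAINT purc1_batchdetailid_fk
    FOREIGN KEY (batchdetailid) REFERENCES batchdetail (batchdetailid);

As I understand it, each ADD CONSTRAINT has to validate every existing row against the referenced table, which I assume is why it pegged the CPU for so long.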

The disks are not reading at max speed during the query: when I ran a VACUUM ANALYZE (after the data migration), sar -b was consistently at 100,000 block reads/sec. It does not seem like the hardware is holding things back here. I read something about 'fsync' recently; would changing that setting apply in this case?
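
In case it matters, the change I'm asking about would just be one more line in the postgresql.conf shown below (as I read the docs, fsync only controls whether each commit forces a flush to disk, so I'm not sure it would help a read-only query at all):

fsync = false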

DATABASE:
'tranheader' table has 2000 tuples, PK index
'batchheader' table has 2.6 million tuples, 5 indexes, FK constraint to tranheader PK
'batchdetail' table has 23 million tuples, 6 indexes, FK constraint to batchheader PK
18 tables with <1000 tuples, most are child tables of batchdetail
All tables have a PK and are normalized
Large nightly INSERTs (~200000 tuples)
Complex SELECTs all day long
No UPDATEs or DELETEs ever, at least until I run low on storage!
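
To make that layout concrete, here is a trimmed-down sketch of the three big tables, showing only the key columns that appear in the query plan at the end of this message (types simplified):

CREATE TABLE tranheader (
    tranheaderid   integer PRIMARY KEY,
    clientid       integer
    -- other header columns omitted
);

CREATE TABLE batchheader (
    batchid        integer PRIMARY KEY,
    tranheaderid   integer REFERENCES tranheader (tranheaderid),
    merchantid     integer,
    batchdate      timestamp
    -- other batch-level columns omitted
);

CREATE TABLE batchdetail (
    batchdetailid  integer PRIMARY KEY,
    batchid        integer REFERENCES batchheader (batchid),
    cardtypeid     integer,
    tranamount     numeric(12,2)
    -- other detail columns omitted
);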

I've learned as much as I can absorb from the online docs and archives about performance tuning. Based on my limited understanding, I've changed the following settings. I am totally open to any suggestions, including starting over with the RAID layout, the filesystems, or PostgreSQL itself. I would almost consider FreeBSD if it helped a lot, but that would be a stretch given my time investment in Linux. This is a brand new machine, so bad hardware is a possibility, but I'm not sure how to go about determining that.

*** /etc/sysctl.conf
kernel.shmmni = 4096
kernel.shmall = 32000000
kernel.shmmax = 512000000

*** /usr/local/pgsql/data/postgresql.conf
tcpip_socket=true
shared_buffers = 32768
max_fsm_relations = 10000
max_fsm_pages = 2000000
sort_mem = 8192
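
If I have the units right (shared_buffers is in 8 kB pages and sort_mem is in kB), those settings work out to roughly:

shared_buffers = 32768 * 8 kB = 256 MB (out of 4 GB RAM)
sort_mem       = 8192 kB      = 8 MB per sort
kernel.shmmax  = 512000000 bytes (about 488 MB), so the 256 MB shared memory segment fits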

POSTGRESQL SYSTEM:
Red Hat Linux 8.0, PostgreSQL 7.3.1 (dedicated to Postgres, aside from the SSH daemon)
Dell PE6600, dual Xeon MP 2.0GHz, 2MB L3 cache, HyperThreading enabled
4.0 GB Physical RAM
/dev/sda1: ext3 101MB /boot
/dev/sda2: ext3 34GB / (sda is 2 disk RAID-1)
none : swap 1.8GB
/dev/sdb1: ext3 104GB /usr/local/pgsql/data (sdb is 6 disk RAID-10)
All 8 drives are 36GB, 15k RPM, Ultra160 SCSI
PERC3/DC 128MB RAID controller

MSSQL SYSTEM:
Dell PE1650, Dual P3 1.1GHz, 1.5GB RAM
Single 18GB, 15k RPM SCSI drive (no RAID)
Windows 2000 Server SP3, SQL Server 2000 SP2

TIA,
Roman Fail
Sr. Web Application Developer
POS Portal, Inc.

EXPLAIN ANALYZE RESULTS:
Limit (cost=370518.31..370518.31 rows=1 width=540) (actual time=1168722.18..1168722.20 rows=5 loops=1)
  ->  Sort (cost=370518.31..370518.31 rows=1 width=540) (actual time=1168722.18..1168722.18 rows=5 loops=1)
        Sort Key: b.batchdate
        ->  Nested Loop (cost=314181.17..370518.30 rows=1 width=540) (actual time=1148191.12..1168722.09 rows=5 loops=1)
              Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
              ->  Nested Loop (cost=314181.17..370461.79 rows=1 width=502) (actual time=1148167.55..1168671.80 rows=5 loops=1)
                    Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
                    ->  Nested Loop (cost=314181.17..370429.29 rows=1 width=485) (actual time=1148167.48..1168671.45 rows=5 loops=1)
                          Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
                          ->  Nested Loop (cost=314181.17..370396.79 rows=1 width=476) (actual time=1148167.41..1168671.08 rows=5 loops=1)
                                Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
                                ->  Nested Loop (cost=314181.17..314402.47 rows=1 width=457) (actual time=1139099.39..1139320.79 rows=5 loops=1)
                                      Join Filter: ("outer".cardtypeid = "inner".cardtypeid)
                                      ->  Merge Join (cost=314181.17..314401.24 rows=1 width=443) (actual time=1138912.13..1139133.00 rows=5 loops=1)
                                            Merge Cond: ("outer".batchid = "inner".batchid)
                                            ->  Sort (cost=127418.59..127418.59 rows=3 width=150) (actual time=9681.91..9681.93 rows=17 loops=1)
                                                  Sort Key: b.batchid
                                                  ->  Hash Join (cost=120787.32..127418.56 rows=3 width=150) (actual time=7708.04..9681.83 rows=17 loops=1)
                                                        Hash Cond: ("outer".merchantid = "inner".merchantid)
                                                        ->  Merge Join (cost=120781.58..125994.80 rows=283597 width=72) (actual time=7655.57..9320.49 rows=213387 loops=1)
                                                              Merge Cond: ("outer".tranheaderid = "inner".tranheaderid)
                                                              ->  Index Scan using tranheader_ix_tranheaderid_idx on tranheader t (cost=0.00..121.15 rows=1923 width=16) (actual time=0.15..10.86 rows=1923 loops=1)
                                                                    Filter: (clientid = 6)
                                                              ->  Sort (cost=120781.58..121552.88 rows=308520 width=56) (actual time=7611.75..8162.81 rows=329431 loops=1)
                                                                    Sort Key: b.tranheaderid
                                                                    ->  Seq Scan on batchheader b (cost=0.00..79587.23 rows=308520 width=56) (actual time=0.90..4186.30 rows=329431 loops=1)
                                                                          Filter: (batchdate > '2002-12-15 00:00:00'::timestamp without time zone)
                                                        ->  Hash (cost=5.74..5.74 rows=1 width=78) (actual time=31.39..31.39 rows=0 loops=1)
                                                              ->  Index Scan using merchants_ix_merchid_idx on merchants m (cost=0.00..5.74 rows=1 width=78) (actual time=31.38..31.38 rows=1 loops=1)
                                                                    Index Cond: (merchid = '701252267'::character varying)
                                            ->  Sort (cost=186762.59..186872.62 rows=44010 width=293) (actual time=1127828.96..1128725.39 rows=368681 loops=1)
                                                  Sort Key: d.batchid
                                                  ->  Index Scan using batchdetail_ix_tranamount_idx on batchdetail d (cost=0.00..176768.18 rows=44010 width=293) (actual time=35.48..1104625.54 rows=370307 loops=1)
                                                        Index Cond: ((tranamount >= 500.0) AND (tranamount <= 700.0))
                                      ->  Seq Scan on cardtype c (cost=0.00..1.10 rows=10 width=14) (actual time=37.44..37.47 rows=10 loops=5)
                                ->  Seq Scan on purc1 p1 (cost=0.00..44259.70 rows=938770 width=19) (actual time=98.09..4187.32 rows=938770 loops=5)
                          ->  Seq Scan on direct dr (cost=0.00..20.00 rows=1000 width=9) (actual time=0.00..0.00 rows=0 loops=5)
                    ->  Seq Scan on carrental cr (cost=0.00..20.00 rows=1000 width=17) (actual time=0.00..0.00 rows=0 loops=5)
              ->  Seq Scan on checks ck (cost=0.00..40.67 rows=1267 width=38) (actual time=1.03..7.63 rows=1267 loops=5)
Total runtime: 1168881.12 msec
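
To summarize what the plan is doing, the query has roughly this shape, reconstructed here from the join and filter conditions above (select list trimmed, LIMIT value is a placeholder; the child tables hanging off batchdetail are outer joins, which is why the empty direct and carrental tables don't eliminate the five result rows):

SELECT b.batchdate, d.tranamount, m.merchid        -- trimmed select list
FROM tranheader t
    JOIN batchheader b     ON b.tranheaderid = t.tranheaderid
    JOIN merchants m       ON m.merchantid = b.merchantid
    JOIN batchdetail d     ON d.batchid = b.batchid
    JOIN cardtype c        ON c.cardtypeid = d.cardtypeid
    LEFT JOIN purc1 p1     ON p1.batchdetailid = d.batchdetailid
    LEFT JOIN direct dr    ON dr.batchdetailid = d.batchdetailid
    LEFT JOIN carrental cr ON cr.batchdetailid = d.batchdetailid
    LEFT JOIN checks ck    ON ck.batchdetailid = d.batchdetailid
WHERE t.clientid = 6
    AND b.batchdate > '2002-12-15'
    AND m.merchid = '701252267'
    AND d.tranamount BETWEEN 500.0 AND 700.0
ORDER BY b.batchdate
LIMIT 10;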
