From: | Mark kirkwood <markir(at)slingshot(dot)co(dot)nz> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | How Postgresql Compares For Query And Load Operations |
Date: | 2001-07-13 10:22:47 |
Message-ID: | 01071322224708.01182@spikey.slithery.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Dear list,
With the advent of Version 7.1.2 I thought it would be interesting to compare
how Postgresql does a certain class of queries (Star Queries), and Data Loads
with some of the other leading databases ( which were in my humble opinion
Mysql 3.23.39, Db2 7.2 ,Oracle 9.0 [ ok - Possibly SQL 2000 too but I dont
run Winanyk] ).
The results were overall very encouraging :
Postgresql can clearly hold its own when compared to the "big guys".
The full details (together with a wee rant) are aviailable on :
http://homepages.slingshot.co.nz/~markir
(if anyone asks I can submit the entire results...but I figured, lets cut to
the chase here....)
There were two areas where Postgresql was slower, and I thought it would be
informative to discuss these briefly :
1 Star query scanning a sigificant portion of a fact table
SELECT
d0.f1,
count(f.val)
FROM dim0 d0,
fact1 f
WHERE d0.d0key = f.d0key
AND d0.f1 between '2007-07-01' AND '2018-07-01'
GROUP BY d0.f1
This query requires summarizing a significant proportion of the 3000000 row (
700Mb ) fact1 table.
Postgres 7.1.2 executed this query like :
Aggregate (cost=2732703.88..2738731.49 rows=120552 width=20)
-> Group (cost=2732703.88..2735717.69 rows=1205521 width=20)
-> Sort (cost=2732703.88..2732703.88 rows=1205521 width=20)
-> Hash Join (cost=1967.52..2550188.93 rows=1205521 width=20)
-> Seq Scan on fact1 f (cost=0.00..1256604.00 rows=3000000
width=8)
-> Hash (cost=1957.47..1957.47 rows=4018 width=12)
-> Index Scan using dim0_q1 on dim0 d0 (cost=0.00..1957.47
rows=4018 width=12)
for an elapsed time of 3m50s
Wheras Oracle 9.0 used :
SELECT STATEMENT Optimizer=CHOOSE (Cost=5810 Card=4020 Bytes =60300)
SORT (GROUP BY) (Cost=5810 Card=4020 Bytes=60300)
HASH JOIN (Cost=5810 Card=2989644 Bytes=44844660)
TABLE ACCESS (FULL) OF 'DIM0' (Cost=4 Card=4020 Bytes= 40200)
TABLE ACCESS (FULL) OF 'FACT1' (Cost=5806 Card=2990089
Bytes=14950445)
for an elapsed time of 50s.
It would seem that Oracle's execution plan is more optimal.
2 Bulk loading data
Buld Load times for a 3000000 row (700Mb ) fact table were
Postgresql 9m30s (copy)
Db2 2m15s (load)
Oracle 5m (sqlldr)
Mysql 2m20s (load)
(Note that Db2 is faster than Mysql here ! )
While I left "fsync = on" for this test... I still think the difference was
worth noting.
Any comments on these sort of results would be welcome.
regards
Mark
From | Date | Subject | |
---|---|---|---|
Next Message | The Hermit Hacker | 2001-07-13 12:11:56 | Re: news server access down? |
Previous Message | Richard Huxton | 2001-07-13 09:17:53 | Re: Foreign keys? |