From: | "Alex Turner" <armtuk(at)gmail(dot)com> |
---|---|
To: | "Guy Rouillier" <guyr-ml1(at)burntmail(dot)com> |
Cc: | "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: High update activity, PostgreSQL vs BigDBMS |
Date: | 2006-12-30 03:22:35 |
Message-ID: | 33c6269f0612291922s2a784533n9e6652af79158da1@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
You should search the archives for Luke Lonegran's posting about how IO in
Postgresql is significantly bottlenecked because it's not async. A 12 disk
array is going to max out Postgresql's max theoretical write capacity to
disk, and therefore BigRDBMS is always going to win in such a config. You
can also look towards Bizgres which allegedly elimates some of these
problems, and is cheaper than most BigRDBMS products.
Alex.
On 12/28/06, Guy Rouillier <guyr-ml1(at)burntmail(dot)com> wrote:
>
> I don't want to violate any license agreement by discussing performance,
> so I'll refer to a large, commercial PostgreSQL-compatible DBMS only as
> BigDBMS here.
>
> I'm trying to convince my employer to replace BigDBMS with PostgreSQL
> for at least some of our Java applications. As a proof of concept, I
> started with a high-volume (but conceptually simple) network data
> collection application. This application collects files of 5-minute
> usage statistics from our network devices, and stores a raw form of
> these stats into one table and a normalized form into a second table.
> We are currently storing about 12 million rows a day in the normalized
> table, and each month we start new tables. For the normalized data, the
> app inserts rows initialized to zero for the entire current day first
> thing in the morning, then throughout the day as stats are received,
> executes updates against existing rows. So the app has very high update
> activity.
>
> In my test environment, I have a dual-x86 Linux platform running the
> application, and an old 4-CPU Sun Enterprise 4500 running BigDBMS and
> PostgreSQL 8.2.0 (only one at a time.) The Sun box has 4 disk arrays
> attached, each with 12 SCSI hard disks (a D1000 and 3 A1000, for those
> familiar with these devices.) The arrays are set up with RAID5. So I'm
> working with a consistent hardware platform for this comparison. I'm
> only processing a small subset of files (144.)
>
> BigDBMS processed this set of data in 20000 seconds, with all foreign
> keys in place. With all foreign keys in place, PG took 54000 seconds to
> complete the same job. I've tried various approaches to autovacuum
> (none, 30-seconds) and it doesn't seem to make much difference. What
> does seem to make a difference is eliminating all the foreign keys; in
> that configuration, PG takes about 30000 seconds. Better, but BigDBMS
> still has it beat significantly.
>
> I've got PG configured so that the system database is on disk array
> 2, as are the transaction log files. The default table space for the
> test database is disk array 3. I've got all the reference tables (the
> tables to which the foreign keys in the stats tables refer) on this
> array. I also store the stats tables on this array. Finally, I put the
> indexes for the stats tables on disk array 4. I don't use disk array 1
> because I believe it is a software array.
>
> I'm out of ideas how to improve this picture any further. I'd
> appreciate some suggestions. Thanks.
>
> --
> Guy Rouillier
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
From | Date | Subject | |
---|---|---|---|
Next Message | Guy Rouillier | 2006-12-31 07:26:31 | Re: High update activity, PostgreSQL vs BigDBMS |
Previous Message | Alvaro Herrera | 2006-12-29 22:07:47 | Re: High update activity, PostgreSQL vs BigDBMS |