From: "Scott Marlowe" <smarlowe(at)qwest(dot)net>
To: "Joshua Marsh" <icub3d(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Large Database Performance suggestions
Date: 2004-10-22 05:03:34
Message-ID: 1098421413.21035.78.camel@localhost.localdomain
Lists: pgsql-performance
On Thu, 2004-10-21 at 21:14, Joshua Marsh wrote:
> Hello everyone,
>
> I am currently working on a data project that uses PostgreSQL
> extensively to store, manage and maintain the data. We haven't had
> any problems regarding database size until recently. The three major
> tables we use never get bigger than 10 million records. With this
> size, we can do things like storing the indexes or even the tables in
> memory to allow faster access.
>
> Recently, we have found customers who are wanting to use our service
> with data files between 100 million and 300 million records. At that
> size, each of the three major tables will hold between 150 million and
> 700 million records. At this size, I can't expect it to run queries
> in 10-15 seconds (what we can do with 10 million records), but would
> prefer to keep them all under a minute.
>
> We did some original testing with a server with 8GB of RAM and
> found we can do operations on data files up to 50 million fairly well,
> but performance drops dramatically after that. Does anyone have any
> suggestions on a good way to improve performance for these extra large
> tables? Things that have come to mind are replication and Beowulf
> clusters, but from what I have recently studied, these don't do so well
> with singular processes. We will have parallel processes running, but
> it's more important that each individual process be fast than that
> several parallel processes run at once.
I'd assume that what's happening is that up to a certain data set size
it all fits in memory, and past that you go from being CPU/memory
bandwidth limited to being I/O limited. If that's the case, then a
faster storage subsystem is the only real answer. If the database is
mostly read, then a large RAID 5 or RAID 1+0 array should help quite a
bit.
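It's also worth making sure PostgreSQL is actually allowed to use the RAM the box has before blaming the disks. A hedged sketch of the relevant postgresql.conf knobs for a read-mostly server with 8GB of RAM, using the memory-unit syntax of recent releases (values are illustrative starting points, not tuned recommendations):

```
# postgresql.conf -- illustrative values for an 8GB, read-mostly server
shared_buffers = 1GB          # PostgreSQL's own buffer cache
effective_cache_size = 6GB    # hint to the planner about OS-level caching
random_page_cost = 3.0        # lower than default if reads mostly hit cache/fast RAID
work_mem = 64MB               # per-sort/hash memory; careful, it's per operation
```

These only help queries that can use the cache; once the working set truly exceeds RAM, you're back to disk throughput.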
You might wanna post some EXPLAIN ANALYZE output for the queries that
slow down past a certain size, along with the schema for those tables,
etc...
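For reference, the useful output comes from prefixing the slow query with EXPLAIN ANALYZE, which runs it and shows actual versus estimated costs (the query and table names below are made up for illustration):

```sql
-- Hypothetical example; substitute your actual slow query.
EXPLAIN ANALYZE
SELECT c.id, count(*)
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.created > '2004-01-01'
GROUP BY c.id;
-- Look for Seq Scan nodes on the big tables, and compare the
-- planner's estimated row counts against the actual ones.
```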