From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Alex Thurlow <alex(at)blastro(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Tuning resource parameters for a logging database.
Date: 2009-05-21 19:36:42
Message-ID: dcc563d10905211236l2d71e82ck1cdb8054d061ccd6@mail.gmail.com
Lists: pgsql-general
On Thu, May 21, 2009 at 1:13 PM, Alex Thurlow <alex(at)blastro(dot)com> wrote:
> I have a postgresql database that I'm using for logging of data. There's
> basically one table where each row is a line from my log files. It's
> getting to a size where it's running very slow though. There are about 10
> million log lines per day and I keep 30 days of data in it. All the columns
> I filter on are indexed (mostly I just use date).
**DING DING DING** you've just said the magic phrase: partitioning by
date would be a big help here.
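For reference, here's a minimal sketch of what that looks like with
old-school inheritance partitioning; the table and column names here are
invented, so adjust them to your schema:

    -- Parent table holds no rows itself; children hold one day each.
    CREATE TABLE log_lines (
        log_date  date NOT NULL,
        host      text,
        line      text
    );

    -- One child per day, with a CHECK constraint the planner can use
    -- to skip every other day's partition.
    CREATE TABLE log_lines_2009_05_21 (
        CHECK (log_date = DATE '2009-05-21')
    ) INHERITS (log_lines);

    CREATE INDEX log_lines_2009_05_21_date_idx
        ON log_lines_2009_05_21 (log_date);

    -- Needed so the planner actually prunes partitions:
    SET constraint_exclusion = on;

    -- Have the logger INSERT into the current day's child directly,
    -- or add a trigger on the parent to route rows.

The big win for a fixed 30-day window is that expiring old data becomes
an instant DROP TABLE on one child instead of a massive DELETE plus
VACUUM on the whole table, and your date-filtered queries only touch one
day's partition.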
> And I tend to pull one
> day of data at a time with grouped counts by 1 or 2 other columns. There
> also tends to be only 1 or 2 of these large queries running at any given
> time, so a lot of resources can be thrown at each one.
>
> I'm wondering what my resource parameters should be for optimal speed of the
> selects on this database, since I haven't seen a good example where someone
> has done anything like this.
With a logging database you're optimizing for two often opposing
workloads. The logging side is a stream of lots of small inserts that
HAVE to get processed and written efficiently. That's usually best
served by minimal shared_buffers and work_mem, because plain inserts
don't benefit from the overhead of large shared_buffers and need almost
no work_mem.
With a reporting database, on the other hand, you run queries that chew
up tons of memory, both shared_buffers and work_mem, to run efficiently.
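One way to serve both from the same box is per-role settings rather than
one global compromise. A sketch, assuming you split the workloads across
two roles (the role names here are made up):

    ALTER ROLE log_writer  SET work_mem = '1MB';    -- streaming inserts need almost none
    ALTER ROLE report_user SET work_mem = '256MB';  -- reports get big sort/hash memory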
> The machine is an 8 core opteron (I know I won't really use those, but Dell
> threw in the 2nd proc for free) with 8 Gb RAM. The database is on a RAID 10
> JFS partition.
Yeah, CPUs are cheap; might as well stock up on them. A reporting
database can quickly go CPU-bound once everything the users want to see
fits in memory.
> This is what I have in postgresql.conf right now..
>
> shared_buffers = 64MB
Small for reporting, just right for logging. I'd try something bigger,
but not insanely huge. Let the OS cache 90% of the data and let the db
cache a good-sized working set; 256MB to 1GB is reasonable, based on
benchmarks of your own queries.
> work_mem = 128MB
Bigger than needed for logging, good for reporting. You can probably
just leave it.
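You can also raise it per session just for the big daily report instead
of globally; a sketch, guessing at the query shape you described (table
and column names are hypothetical):

    -- Applies only to this session, so the insert stream is unaffected.
    SET work_mem = '256MB';
    SELECT host, count(*)
      FROM log_lines
     WHERE log_date = DATE '2009-05-20'
     GROUP BY host;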
> maintenance_work_mem = 256MB
> max_fsm_pages = 614400
If you're not partitioning, then this needs to be big enough to track at
least one day's worth of dead rows.
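If I remember right, the tail end of a database-wide VACUUM VERBOSE run
reports how many free-space-map page slots are actually needed, which
tells you whether max_fsm_pages is big enough:

    -- Run as superuser; check the final INFO lines about page slots
    -- needed versus max_fsm_pages.
    VACUUM VERBOSE;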
Also look at lowering random_page_cost and raising
default_statistics_target to somewhere between 100 and 1000, depending
on your data and on EXPLAIN ANALYZE testing of your queries.
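Both of those can be tried per session (and per column, for the stats
target) before you commit anything to postgresql.conf. A sketch, again
with hypothetical names:

    -- Cheaper random I/O assumption for a RAID-10 array; default is 4.0.
    SET random_page_cost = 2.0;
    EXPLAIN ANALYZE
    SELECT host, count(*)
      FROM log_lines
     WHERE log_date = DATE '2009-05-20'
     GROUP BY host;

    -- Raise stats just on the column you filter by, then re-analyze so
    -- the planner sees the new histogram.
    ALTER TABLE log_lines ALTER COLUMN log_date SET STATISTICS 200;
    ANALYZE log_lines;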