From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Can lots of small writes badly hamper reads from other tables?
Date: 2012-01-24 20:16:19
Message-ID: CALi4UpiWCoaC4OOCKo_s34gket2Bpv_ZVET62N4TuEn33K+TTw@mail.gmail.com
Lists: pgsql-performance
Hi folks
This could be a sheer volume issue, but I thought I would ask the wisdom of
this forum as to next investigative steps.
----
We use PostgreSQL 8.4.4, which is bundled with our application as a VMware
virtual appliance. The bulk of the app's database activity is recording
performance data points, which arrive in fairly large sustained bursts of
perhaps 10,000 rows a minute at a medium-sized customer. Each row is a
logically separate item and is committed as an individual transaction
(JDBC auto-commit mode). Our offshore QA team was assigned to track an
intermittent slowdown in some large queries on other tables, and based on
timing correlation they believe the two activities may be contending.
The large query reads from different tables than the ones being written
to ... the raw data goes into a table named by day (partitioning is handled
within the app, not by PG), e.g. PERF_RAW_2012_01_24, and then a bunch of
rollup statements run hourly to do the aggregations, e.g.

    insert into PERF_HOURLY_2012_01_24
    select key_columns, avg(data), now()
    from PERF_RAW_2012_01_24
    where time_stamp between (now() - interval '1 hour') and now()
    group by key_columns
The big queries hit several of the PERF_HOURLY tables, pulling a few dozen
rows from each.
We are using a 64-bit VM with 8 virtual cores and 8GB RAM, of which Java
takes a bit over half, running Linux XXXXX with CentOS 5.x .... PG has 1GB of
buffer cache and reasonable (AFAICT) resource limits for everything else,
which are intended to be workable for a range of client sizes out of the
box. True transactional consistency is disabled for performance reasons,
since virtual environments do not take kindly to lots of small writes.
---
Is there any tweaking we should do on the PG settings, or on the pattern in
which the app is writing? We currently use 10 writer threads on the Java
side, and they keep PG pretty busy.
I considered bundling the writes into larger transactions; will that really
help much with commit consistency off?
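(My own rough understanding, stated as an assumption rather than fact: even
with synchronous_commit off, batching still saves the fixed per-transaction
overhead, WAL records, and lock/snapshot setup of 10,000 tiny commits. In SQL
terms the change would look something like this; the table and column names
are made up for illustration:

```sql
-- Current pattern: one transaction per row (JDBC auto-commit), repeated
-- ~10,000 times a minute:
--   BEGIN; INSERT INTO perf_raw_2012_01_24 VALUES (...); COMMIT;

-- Batched pattern: one transaction per few hundred rows
BEGIN;
INSERT INTO perf_raw_2012_01_24 (key_col, data, time_stamp) VALUES
    ('host1.cpu', 42.0, now()),
    ('host2.cpu', 17.5, now());
    -- ... a few hundred more rows per statement or per transaction ...
COMMIT;
```

On the Java side this would mean Connection.setAutoCommit(false) plus
PreparedStatement.addBatch()/executeBatch(), committing every N rows.)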
Is there some specific "usual suspect" stuff I should look at on the PG
side to look for efficiency issues such as index lock contention or a poor
buffer cache hit ratio? Will doing EXPLAIN ANALYSE on the big query be
informative, and if so, does it need to be done while the write load is
applied?
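(For the buffer cache question, 8.4's statistics views can give a first-cut
answer; the view and column names below are real, but the ratio is only
approximate since reads satisfied by the OS page cache still count as
"read" here:

```sql
-- Overall buffer cache hit ratio per database
SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4)
           AS hit_ratio
FROM pg_stat_database;

-- Per-table breakdown, worst offenders first
SELECT relname, heap_blks_hit, heap_blks_read
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;
```

Running these before and during the write burst would show whether the big
queries' working set is being evicted by the incoming raw data.)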
The other wacky idea I had was to have the writer threads pause or throttle
themselves while a big query is running (it's all in one JVM and we are
using a connection pooler, so it's easy to intercept and track if needed).
However, that strikes me as a rather ugly hack, and I'd prefer something
more robust, based on config tweaks that leverage PG's existing resource
management.
Relevant schema and config attached, all comments and advice welcome,
including general tuning tips and rationale for moving to PG 9.x .... I'm
well aware this isn't the acme of PG tuning :)
Cheers
Dave
Attachments:
postgresql.conf (application/octet-stream, 16.7 KB)
schema.log (text/x-log, 1.8 KB)