We recently upgraded the databases for our circuit court applications
from PostgreSQL 8.2.5 to 8.3.4. The application software didn't
change. Most software runs fine, and our benchmarks prior to the
update tended to show a measurable, if not dramatic, performance
improvement overall. We have found one area where jobs are running
much longer and having a greater impact on concurrent jobs -- those
where the programmer creates and drops many temporary tables
(thousands) within a database transaction. We are looking to mitigate
the problem while we look into rewriting queries where the temporary
table usage isn't really needed (probably most of them, but the
rewrites are not trivial).
I'm trying to quantify the issue, and would appreciate any
suggestions, either for mitigation or collecting useful data to find
the cause of the performance regression. I create a script which
brackets 1000 lines like the following within a single begin/commit:
create temporary table tt (c1 int not null primary key, c2 text, c3
text); drop table tt;
I run this repeatedly, to get a "steady state", with just enough
settling time between runs to spot the boundaries of the runs in the
vmstat 1 output (5 to 20 seconds between runs). I'm surprised at how
much disk output there is for this, in either version. In 8.2.5 a
typical run has about 156,000 disk writes in the vmstat output, while
8.3.4 has about 172,000 writes.
During the main part of the run 8.2.5 ranges between 0 and 15 percent
of cpu time in I/O wait, averaging around 10%; while 8.3.4 ranges
between 15 and 25 percent of cpu time in I/O wait, averaging around
18%, with occasional outliers on both sides, down to 5% and up to 55%.
For both, there's a period of time at the end of the transaction
where the COMMIT seems to be doing disk output without any disk wait,
suggesting that the BBU RAID controller is either able to write these
faster because there are multiple updates to the same sectors which
get combined, or that they can be written sequentially.
The time required for psql to run the script varies little in 8.2.5 --
from 4m43.843s to 4m49.388s. Under 8.3.4 this bounces around from run
to run -- from 1m28.270s to 5m39.327s.
I can't help wondering why creating and dropping a temporary table
requires over 150 disk writes. I also wonder whether there is
something in 8.3.4 which directly causes more writes, or whether it is
the result of the new checkpoint and background writer hitting some
pessimal usage pattern where "just in time" writes become "just too
late" to be efficient.
Most concerning is that the 8.3.4 I/O wait time results in slow
performance for interactive tasks and results in frustrated users
calling the support line complaining of slowness. I can confirm that
queries which normally run in 10 to 20 ms are running for several
seconds in competition with the temporary table creation/drop queries,
which wasn't the case before.
I'm going to get the latest snapshot to see if the issue has changed
for 8.4devel, but I figured I should post the information I have so
far to get feedback.
-Kevin