Continual Postgres headaches...

From: "Weber, Geoffrey M(dot)" <Geoffrey(dot)Weber(at)mcleodusa(dot)com>
To: "'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Subject: Continual Postgres headaches...
Date: 2007-12-06 19:22:30
Message-ID: 6C68527B1D25584492BB30AF713C277E2B494D@iacedexch04.mcld.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I've been trying for quite a while to get Postgresql tuned for use as an
OLTP system. I have several PL/pgSQL functions that handle inserts and
updates to the main table and several near-real-time daemons written that
access the data and can take automated actions on it (email/page concerned
people, get complimentary information from a different system, etc.). I
started with Postgres 8.1 and am now using 8.2.4 (and have been since its
release). I'll try to provide enough information for a decent response, but
as I can't obviously put my entire schema and database out there, I'm hoping
that I can get some decent guidelines beyond that what I've found though
Google, etc. to get this thing tuned better.

Most of the data centers in on a central table and has 23 columns, 1
constraint, and 9 indexes. 4 of the indexes are partial. The table usually
contains about 3-4 million rows, but I've cut it down to 1.2 million (cut
out 2/3 of the data) in an effort to migrate the database to a 2nd sever for
more testing. The two partial indexes used the most: 242MB accessed nearly
constantly, and 15MB accessed every 5 seconds - but also updated constantly
via inserts using the 242MB index. Other than one other 25MB index, the
others seem to average around 300MB each, but these aren't used quite as
often (usually about every minute or so).

My problems really are with performance consistency. I have tweaked the
execution so that everything should run with sub-second execution times, but
even after everything is running well, I can get at most a week or two of
steady running before things start to degrade.

In my efforts to correct these consistency in execution problems, I have
gone from vacuuming (with analyze) twice a day to every 30 minutes (how long
it takes a vacuum analyze to run - another seeming problem because it
shouldn't take so long?). I've done a VACUUM FULL and that sometimes helps,
but obviously is not a long-term solution for a true OLTP system. Why I
said "sometimes helps": earlier this week I did a VACUUM FULL, VACUUM
ANALYZE on the entire schema. A query that was supposed to use the 25MB
index above (that matched the parameters of the partial index exactly) was
still not doing so. I had to DROP and re-CREATE the index (and do another
ANALYZE) to get the database to (correctly) use the index.

Another problem: sometimes I get these unexplainable delays in using a
'stored procedure' even though all its SQL calls run just fine individually.
For example, calling a particular function will take 1 minute to execute
even though entering the SQL commands individually through psql will total
up to about 1 second. When I log 'long-duration queries', I only see the
top function call in the log. When I put 'RAISE NOTICE' statements in the
various PL/pgSQL functions, I only see them displayed for the function I
call directly; any underlying functions called from that function does not
show any of their NOTICE statements. Because of this I can't track down
where the delays are occurring for me to try and correct the problem (not
that there should be one in the first place)! I should also note that there
is not any lock contention (double-checked with pgadmin3) showing that would
"artificially" delay the queries...

I've played with quite a few of the parameters like vacuum_cost_delay,
work_mem, shared_buffers, turned fsync off, etc. in the postgresql.conf
file. The server I've been given to use is quite old (read: not the fastest
anymore), but normally shows a load average of 0.25.to 0.50 or so.

I guess my questions boil down to what I can do other than use the EXPLAIN,
EXPLAIN ANALYZE, VACUUM ANALYZE, etc. to get my schema performing acceptably
across the board in a consistent manner. If there is more information that
I can give out that would improve responses, please let me know.

Thanks in advance for any help you send my way!

NOTICE: This electronic mail transmission may contain confidential
information and is intended only for the person(s) named. Any use, copying
or disclosure by any other person is strictly prohibited. If you have
received this transmission in error, please notify the sender via e-mail.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2007-12-06 19:26:28 Re: Nested loop in simple query taking long time
Previous Message Henrik 2007-12-06 19:11:34 Re: Nested loop in simple query taking long time