>>> On Tue, Aug 21, 2007 at 2:36 AM, in message
<Pine.GSO.4.64.0708210326370.2093@westnet.com>, Greg Smith
<gsmith@gregsmith.com> wrote:
> On Mon, 20 Aug 2007, Kevin Grittner wrote:
>
>> The problem went away completely when we used a very aggressive
>> background writer configuration, to put the dirty pages in front of the
>> OS file system right away, so that its algorithms and the controller
>> cache could deal with things before they got out of hand.
>
> Can you share what the aggressive BGW settings you settled on were?
Under 8.2.4, here is the postgresql.conf file, minus blank lines and comments:
listen_addresses = '*'
max_connections = 200
shared_buffers = 160MB
temp_buffers = 50MB
work_mem = 10MB
maintenance_work_mem = 160MB
max_fsm_pages = 800000
bgwriter_lru_percent = 20.0
bgwriter_lru_maxpages = 200
bgwriter_all_percent = 10.0
bgwriter_all_maxpages = 600
wal_buffers = 160kB
checkpoint_segments = 10
random_page_cost = 2.0
effective_cache_size = 5GB
redirect_stderr = on
log_line_prefix = '[%m] %p %q<%u %d %r> '
stats_block_level = on
stats_row_level = on
autovacuum = on
autovacuum_naptime = 10s
autovacuum_vacuum_threshold = 1
autovacuum_analyze_threshold = 1
datestyle = 'iso, mdy'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
escape_string_warning = off
standard_conforming_strings = on
sql_inheritance = off
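For a sense of how aggressive that is: with the default bgwriter_delay of 200ms (five rounds per second; we don't override it) and the standard 8kB page size, the maxpages caps above allow roughly 30MB/s of background writes in the worst case, versus well under 1MB/s with the 8.2 defaults. A back-of-the-envelope sketch of that arithmetic (the delay and page size are assumed defaults, not settings from the file above):

# Worst-case background writer throughput for the settings above.
# Assumes the default bgwriter_delay = 200ms and 8kB pages; neither is
# overridden in the postgresql.conf shown.
PAGE_KB = 8
ROUNDS_PER_SEC = 1000 / 200          # one bgwriter round every 200ms

pool_pages = 160 * 1024 // PAGE_KB   # shared_buffers = 160MB -> 20480 pages

# Each round, the LRU scan covers up to 20% of the pool and writes at most
# 200 dirty pages; the all-buffers scan covers up to 10% and writes at most 600.
lru_cap = min(0.20 * pool_pages, 200)
all_cap = min(0.10 * pool_pages, 600)

mb_per_sec = (lru_cap + all_cap) * ROUNDS_PER_SEC * PAGE_KB / 1024
print(f"~{mb_per_sec:.0f} MB/s worst case")   # ~31 MB/s with these settings

We never come anywhere near that ceiling; the point of setting the caps that high is just to get dirty pages in front of the OS and the controller cache almost as soon as they are created.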
> Also be helpful to know about how often you have a
> checkpoint with your current configuration
Since our problems associated with checkpoints have been eliminated, I
haven't been tracking them very closely. A scan of our log files on this
machine shows that there were two episodes in the last two months where we
had 20 to 40 checkpoints in a day which came less than 30 seconds after the
prior one. These were associated with times when this central copy had
fallen significantly behind the source databases and replication was running
full out, catching up with the sources.
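That scan amounts to counting checkpoint_warning messages per day. A rough sketch of it (assuming the stock "checkpoints are occurring too frequently" wording and the log_line_prefix above, so each line starts with the [%m] timestamp):

# Count "checkpoints are occurring too frequently" warnings per day.
# Assumes the stock checkpoint_warning message text and the log_line_prefix
# above, so each line starts with a "[YYYY-MM-DD hh:mm:ss..." stamp.
import re
import sys
from collections import Counter

pattern = re.compile(r"checkpoints are occurring too frequently \(\d+ seconds apart\)")
per_day = Counter()

for path in sys.argv[1:]:
    with open(path, errors="replace") as f:
        for line in f:
            if line.startswith("[") and pattern.search(line):
                per_day[line[1:11]] += 1   # date portion of the [%m] prefix

for day, count in sorted(per_day.items()):
    print(day, count)

Run it over the server log files for the period in question.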
Outside of those events we see one to four such episodes on scattered
days. A quick check shows that all but a few coincide with a weekly run
which passes massive amounts of data to rebuild a table of cases which
should not appear on the public web site under the Supreme Court's
records retention rules.
Is there anything you would like me to do to capture more information
about the checkpoints?
-Kevin