Calculating how much redo log space has been used

From: Reza Taheri <rtaheri(at)vmware(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Calculating how much redo log space has been used
Date: 2018-08-14 18:51:34
Message-ID: 32CF54B1-DEAA-46D9-9B65-05D75B591D60@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello PostgreSQL community,
I am helping with a benchmarking exercise using PGSQL (I chair the TPC subcommittee<http://www.tpc.org/tpcx-v/default.asp> that has released a benchmark using PGSQL). A requirement of the benchmark is having enough log space allocated for 8 hours of running without needing to archive, back up, etc. I am trying to a) figure out how I can establish the exact space usage for the auditor; and b) how I can reduce the log space usage. Looking at iostat and pgstatspack, it looks like we will need to allocate something like 1.5TB of log space for a 5TB database, which is a huge ratio. (Yes, in the real world, we’d probably archive or ship the logs; but for benchmarking, that doesn’t work)

pgstatspack gives me something like below:

background writer stats

checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc

-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------

22 | 0 | 6416768 | 2252636 | 0 | 280211 | 9786558

(1 row)

background writer relative stats

checkpoints_timed | minutes_between_checkpoint | buffers_checkpoint | buffers_clean | buffers_backend | total_writes | avg_checkpoint_write

-------------------+----------------------------+--------------------+---------------+-----------------+--------------+----------------------

100% | 6 | 71% | 25% | 3% | 8.659 MB/s | 2278.000 MB

I can calculate how many checkpoint segments I have used from the MB/s. But is there a more direct way of seeing how/when a checkpoint segment is filled up and we move on to the next one?

Also, it looks like the full_page_writes parameter is the only thing that can help reduce the log usage size, but that I have to set it to 1 to avoid corruption after a system crash, which is a requirement. Another requirement is a very short, 6-minute checkpoint time, which means we will likely write the full page very often. Yes, my hands are tied!

Here are the relevant non-default settings:

shared_buffers = 18000MB # min 128kB

temp_buffers = 2MB # min 800kB

maintenance_work_mem = 5MB # min 1MB

bgwriter_delay = 10ms # 10-10000ms between rounds

bgwriter_lru_maxpages = 200 # 0-1000 max buffers written/round

effective_io_concurrency = 10 # 1-1000; 0 disables prefetching

wal_sync_method = open_datasync # the default is the first option

wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers

wal_writer_delay = 10ms # 1-10000 milliseconds

checkpoint_segments = 750 # in logfile segments, min 1, 16MB each

checkpoint_timeout = 6min # range 30s-1h

checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0

effective_cache_size = 512MB

default_statistics_target = 10000 # range 1-10000

log_destination = 'stderr' # Valid values are combinations of

logging_collector = on # Enable capturing of stderr and csvlog

log_directory = 'pg_log' # directory where log files are written,

log_filename = 'postgresql-%a.log' # log file name pattern,

log_truncate_on_rotation = on # If on, an existing log file with the

log_rotation_age = 1d # Automatic rotation of logfiles will

log_rotation_size = 0 # Automatic rotation of logfiles will

log_checkpoints = on

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andres Freund 2018-08-14 19:31:18 Re: Calculating how much redo log space has been used
Previous Message Andres Freund 2018-08-14 17:50:02 Re: Bi-modal streaming replication throughput