From: | Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com> |
---|---|
To: | depesz(at)depesz(dot)com |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Why so many xlogs? |
Date: | 2010-11-01 18:22:50 |
Message-ID: | AANLkTikTeCqFHgZeUbw15mubA6AyQvigexqacQfy5BBT@mail.gmail.com |
Lists: | pgsql-general |
2010/11/1 hubert depesz lubaczewski <depesz(at)depesz(dot)com>:
> Hi,
> I have a strange situation: too many xlog files.
>
> PostgreSQL 8.3.11 on i386-pc-solaris2.10, compiled by cc -Xa
>
> config:
> # select name, setting from pg_settings where name ~ 'checkpoint|wal' order by 1;
> name | setting
> ------------------------------+---------------
> checkpoint_completion_target | 0.9
> checkpoint_segments | 100
> checkpoint_timeout | 900
> checkpoint_warning | 30
> log_checkpoints | on
> wal_buffers | 2048
> wal_sync_method | open_datasync
> wal_writer_delay | 200
> (8 rows)
>
> As I understand it, the maximum number of xlog files in pg_xlog should be ( 1 + 2 *
> checkpoint_segments ).
In 8.3 the documented bound is normally:
(2 + checkpoint_completion_target) * checkpoint_segments + 1
=> 291 with your settings
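
For anyone who wants to plug in their own settings, here is a minimal sketch of that arithmetic. The function name max_wal_segments is made up for illustration and is not something PostgreSQL provides; the result is rounded to the nearest whole file, which is my assumption for fractional results.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PostgreSQL): evaluates
#   (2 + checkpoint_completion_target) * checkpoint_segments + 1
# for the given settings.
max_wal_segments() {
    local segments=$1 target=$2
    # bash arithmetic is integer-only, so awk does the floating-point part;
    # adding 0.5 before %d truncation rounds to the nearest whole file.
    awk -v s="$segments" -v t="$target" \
        'BEGIN { printf "%d\n", (2 + t) * s + 1 + 0.5 }'
}

# Settings from the quoted pg_settings output:
# checkpoint_segments = 100, checkpoint_completion_target = 0.9
max_wal_segments 100 0.9
```

This is a soft limit ("normally not more than"), so a short-lived overshoot does not by itself indicate a problem.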
> In our case there are more.
>
> I added a cron job to log the number of segments, the current segment
> name, and the number of segments in pg_xlog that come before and after
> the current one. The script is:
>
> ------------------------------------------------------------
> #!/usr/bin/bash
> LOGFILE=/home/postgres/logs/check_pg_xlog.out
>
> LS_OUTPUT=$( ls -l /pgdata/main/pg_xlog | egrep -v "xlogtemp|backup|status|total" | sort -k9 )
> FIRST_SEGMENT_LINE=$( echo "$LS_OUTPUT" | head -1 )
> LAST_SEGMENT_LINE=$( echo "$LS_OUTPUT" | tail -1 )
>
> FIRST_SEGMENT=$( echo "$FIRST_SEGMENT_LINE" | awk '{print $NF}' )
> LAST_SEGMENT=$( echo "$LAST_SEGMENT_LINE" | awk '{print $NF}' )
> FIRST_SEGMENT_NUM=$( echo "$FIRST_SEGMENT" | awk '{print $NF}' | cut -b 9-16,23-24 )
> LAST_SEGMENT_NUM=$( echo "$LAST_SEGMENT" | awk '{print $NF}' | cut -b 9-16,23-24 )
>
> SEGMENT_COUNT=$( printf $'ibase=16\n1 + %s - %s\n' $LAST_SEGMENT_NUM $FIRST_SEGMENT_NUM | bc )
> CURRENT_WAL_FILE=$( /opt/pgsql8311/bin/psql -U postgres -qAtX -c 'select file_name from pg_xlogfile_name_offset( pg_current_xlog_location())' )
> CURRENT_WAL_FILE_NUM=$( echo "$CURRENT_WAL_FILE" | cut -b 9-16,23-24 )
>
> SEGMENTS_BEFORE_CURRENT=$( printf $'ibase=16\n%s - %s\n' $CURRENT_WAL_FILE_NUM $FIRST_SEGMENT_NUM | bc )
> SEGMENTS_AFTER_CURRENT=$( printf $'ibase=16\n%s - %s\n' $LAST_SEGMENT_NUM $CURRENT_WAL_FILE_NUM | bc )
>
> CURRENT_SEGMENT_LINE=$( echo "$LS_OUTPUT" | grep "$CURRENT_WAL_FILE" )
> (
> date
> printf $'First segment : %s\n' "$FIRST_SEGMENT_LINE"
> printf $'Current segment : %s\n' "$CURRENT_SEGMENT_LINE"
> printf $'Last segment : %s\n' "$LAST_SEGMENT_LINE"
> printf $'Segment count : %s\n' "$SEGMENT_COUNT"
> printf $'Current wal segment : %s\n' "$CURRENT_WAL_FILE"
> printf $'Segments before current : %s\n' "$SEGMENTS_BEFORE_CURRENT"
> printf $'Segments after current : %s\n' "$SEGMENTS_AFTER_CURRENT"
> printf $'Last checkpoint time : %s\n' "$( /opt/pgsql8311/bin/pg_controldata /pgdata/main | egrep '^Time of latest checkpoint:' | sed 's/^[^:]*: *//' )"
> /opt/pgsql8311/bin/psql -U postgres -c "select name, setting from pg_settings where name = any('{checkpoint_timeout,checkpoint_segments,archive_mode,archive_command}')"
> ) >> $LOGFILE
> ------------------------------------------------------------
>
>
> Sample output looks like this:
>
> | Mon Nov 1 13:46:00 EDT 2010
> | First segment : -rw------- 1 postgres postgres 16777216 Nov 1 13:16 000000010000376700000053
> | Current segment : -rw------- 1 postgres postgres 16777216 Nov 1 13:45 000000010000376700000064
> | Last segment : -rw------- 1 postgres postgres 16777216 Nov 1 13:01 000000010000376800000029
> | Segment count : 215
> | Current wal segment : 000000010000376700000064
> | Segments before current : 17
> | Segments after current : 197
> | Last checkpoint time : Mon Nov 01 13:31:29 2010
> | name | setting
> | ---------------------+---------------
> | archive_command | /usr/bin/true
> | archive_mode | on
> | checkpoint_segments | 100
> | checkpoint_timeout | 900
> | (4 rows)
>
> As you can see, we now have 215 segments: 17 that hold WAL before the current location and 197 that come after the current segment!
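
The numbers in that sample can be cross-checked directly from the three file names. This is only a sketch: wal_seg_num is a hypothetical helper that mirrors the script's `cut -b 9-16,23-24`, using bash base-16 arithmetic instead of bc.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a 24-character WAL file name to the number the
# quoted script works with (log id followed by the low byte of the segment id).
wal_seg_num() {
    local name=$1
    # characters 9-16 are the log id, characters 23-24 the last two hex digits
    echo $(( 16#${name:8:8}${name:22:2} ))
}

# File names taken from the sample output above
first=$(wal_seg_num 000000010000376700000053)
current=$(wal_seg_num 000000010000376700000064)
last=$(wal_seg_num 000000010000376800000029)

echo "Segment count           : $(( last - first + 1 ))"
echo "Segments before current : $(( current - first ))"
echo "Segments after current  : $(( last - current ))"
```

One caveat, as far as I know: releases of this era do not use segment FF within a log id, so treating each log id as 256 segments can overcount by one each time the range crosses a log id boundary, as it does here between 3767 and 3768.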
>
> Here you can see a graph that plots the number of WAL segments over the last week: http://depesz.com/various/bad-wal.jpg
>
> It virtually never goes below 215, and it spikes to 270-300.
>
> The log from my test script, starting on the 20th of October, is here: http://www.depesz.com/various/bad-wal.log.gz
>
> Any ideas why the number of segments is higher than expected?
>
> Just to be clear: I do not want to lower it by changing
> checkpoint_segments. I'm looking for information/enlightenment about why
> it works the way it works, and what could possibly be wrong.
>
> Best regards,
>
> depesz
>
> --
> Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
> jid/gtalk: depesz(at)depesz(dot)com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support