From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Karl Denninger <karl(at)denninger(dot)net>, koichi(dot)szk(at)gmail(dot)com |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Problem with pg_compresslog'd archives |
Date: | 2010-02-07 20:17:57 |
Message-ID: | Pine.LNX.4.64.1002072317140.16860@sn.sai.msu.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Koichi-san,
seems, very similar problem.
Regards,
Oleg
On Sun, 7 Feb 2010, Karl Denninger wrote:
> This may belong in a bug report, but I'll post it here first...
>
> There appears to be a **SERIOUS** problem with using pg_compresslog and
> pg_uncompresslog with Postgresql 8.4.2.
>
> Here's my configuration snippet:
>
> full_page_writes = on # recover from partial page writes
> wal_buffers = 256kB # min 32kB
> # (change requires restart)
> #wal_writer_delay = 200ms # 1-10000 milliseconds
>
> #commit_delay = 0 # range 0-100000, in microseconds
> #commit_siblings = 5 # range 1-1000
>
> # - Checkpoints -
>
> checkpoint_segments = 64 # in logfile segments, min 1,
> 16MB each
> #checkpoint_timeout = 5min # range 30s-1h
> checkpoint_completion_target = 0.9 # checkpoint target duration,
> 0.0 - 1.0
> #checkpoint_warning = 30s # 0 disables
>
> archive_command = 'test ! -f /dbms/pg_archive/%f.bz2 && pg_compresslog
> %p | bzip2 - >/dbms/pg_archive/%f.bz2' #command to use to
> archive a logfile segment
>
> All appears to be fine with the writes, and they are being saved off on
> the nightly backups without incident.
>
> I take a full dump using the instructions in the documentation and make
> sure I copy the proper "must have" file for consistency to be reached.
>
> The problem comes when I try to restore.
>
> recovery_conf contains:
>
> restore_command = '/usr/local/pgsql/recovery.sh %f %p'
>
> And that file contains:
>
>
> #! /bin/sh
>
> infile=$1
> outfile=$2
>
> if test -f /dbms/pg_archive/$infile.bz2
> then
> bunzip2 -c /dbms/pg_archive/$infile.bz2 |
> /usr/local/pgsql/bin/pg_decompresslog - $outfile
> exit 0
> else
> exit 1
> fi
>
> ==============
>
> The problem is that it appears that some of the segments being saved are
> no good! On occasion I get this when trying to restore...
>
> Feb 7 12:43:51 dbms2 postgres[2001]: [210-1] LOG: restored log file
> "00000001000001710000009A" from archive
> Feb 7 12:43:52 dbms2 postgres[2001]: [211-1] LOG: restored log file
> "00000001000001710000009B" from archive
> Feb 7 12:43:52 dbms2 postgres[2001]: [212-1] LOG: restored log file
> "00000001000001710000009C" from archive
> Feb 7 12:43:52 dbms2 postgres[2001]: [213-1] LOG: restored log file
> "00000001000001710000009D" from archive
> Feb 7 12:43:53 dbms2 postgres[2001]: [214-1] LOG: restored log file
> "00000001000001710000009E" from archive
> Feb 7 12:43:53 dbms2 postgres[2001]: [215-1] LOG: restored log file
> "00000001000001710000009F" from archive
> Feb 7 12:43:54 dbms2 postgres[2001]: [216-1] LOG: restored log file
> "0000000100000171000000A0" from archive
> Feb 7 12:43:54 dbms2 postgres[2001]: [217-1] LOG: restored log file
> "0000000100000171000000A1" from archive
> Feb 7 12:43:55 dbms2 postgres[2001]: [218-1] LOG: restored log file
> "0000000100000171000000A2" from archive
> Feb 7 12:43:55 dbms2 postgres[2001]: [219-1] LOG: restored log file
> "0000000100000171000000A3" from archive
> Feb 7 12:43:56 dbms2 postgres[2001]: [220-1] LOG: restored log file
> "0000000100000171000000A4" from archive
> Feb 7 12:43:56 dbms2 postgres[2001]: [221-1] LOG: restored log file
> "0000000100000171000000A5" from archive
> Feb 7 12:43:57 dbms2 postgres[2001]: [222-1] LOG: restored log file
> "0000000100000171000000A6" from archive
> Feb 7 12:43:57 dbms2 postgres[2001]: [223-1] PANIC: corrupted page
> pointers: lower = 772, upper = 616, special = 0
> Feb 7 12:43:57 dbms2 postgres[2001]: [223-2] CONTEXT: xlog redo
> hot_update: rel 1663/616245/1193269; tid 53/93; new 53/4
> Feb 7 12:43:57 dbms2 postgres[2000]: [1-1] LOG: startup process (PID
> 2001) was terminated by signal 6: Abort trap
> Feb 7 12:43:57 dbms2 postgres[2000]: [2-1] LOG: terminating any other
> active server processes
>
> Eek.
>
> I assume this means that either A6 or A7 is corrupt. But I have the
> file both in the restore AND ON THE MACHINE WHERE IT ORIGINATED:
>
> On the SOURCE machine (which is running just fine):
> tickerforum# cksum *171*A[67]*
> 172998591 830621 0000000100000171000000A6.bz2
> 1283345296 1541006 0000000100000171000000A7.bz2
>
> And off the BACKUP archive, which is what I'm trying to restore:
>
> # cksum *171*A[67]*
> 172998591 830621 0000000100000171000000A6.bz2
> 1283345296 1541006 0000000100000171000000A7.bz2
>
> Identical, says the checksums.
>
> This is VERY BAD - if pg_compresslog is damaging the files in some
> instances then ANY BACKUP TAKEN USING THEM IS SUSPECT AND MAY NOT
> RESTORE!!!!!!
>
> Needless to say this is a MAJOR problem.
>
> -- Karl Denninger
>
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
From | Date | Subject | |
---|---|---|---|
Next Message | John R Pierce | 2010-02-07 20:27:31 | Re: [GENERAL] weird bug in rebuilding RPMs |
Previous Message | Karl Denninger | 2010-02-07 19:54:55 | Problem with pg_compresslog'd archives |