Re: PostgreSQL 10 not archiving some WAL files

From: Norberto Dellê <betodelle(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PostgreSQL 10 not archiving some WAL files
Date: 2020-03-19 12:53:56
Message-ID: CAC_JJAuOchGUZ3Q1Znvy4o_JH64HPRq6oRM-g9CYPB-21rDGcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>On Wed, Mar 18, 2020 at 10:57:22AM -0300, Norberto Dellê wrote:
> >> This setup worked very well for a long time, but since we upgraded
> >> PostgreSQL to newer versions (10.x), sometimes it just skips archiving
> some
> >> wal files. This seems to happen mainly when the server is shut down.
> >> There's no messages about errors in the logs, and also no corresponding
> >> .done file in archive_status, it just goes to the next wal file.
> >>
> >> I would like to know if any of you had or has a problem like this, and
> if
> >> you had, how you solved it.
> >
> >Hmm. I don't recall seeing any of that. This would mean that we are
> >either missing the generation of some .ready file, or that some .done
> >file gets generated when they should not in archive_status/. What
> >kind of server shutdown are you doing? Immediate so as recovery
> >happens at the follow-up startup. Or is that a clean service
> >shutdown?
> >--
> >Michael
> >...
>
> Hi Michael
>
> Almost daily the server is shutdown, because my client wants to save
> energy.
> In the logs, it appears that the service was not shut down properly.
> Here's the last lines of one log file:
>
> 2020-03-16 18:10:03.130 -04 [1964] LOG: received fast shutdown request
> 2020-03-16 18:10:03.132 -04 [2948] ERROR: canceling statement due to user
> request
> 2020-03-16 18:10:03.192 -04 [1964] LOG: aborting any active transactions
> 2020-03-16 18:10:03.194 -04 [1964] LOG: worker process: logical
> replication launcher (PID 2948) exited with exit code 1
> 2020-03-16 18:10:03.210 -04 [2188] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [6688] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [3348] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [6356] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [5736] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [4028] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.211 -04 [2552] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.212 -04 [4280] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [2336] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.227 -04 [6564] FATAL: terminating connection due to
> administrator command
> 2020-03-16 18:10:03.255 -04 [2884] LOG: shutting down
>
> When the service is started the next day, this is what Postgres logs:
>
> 2020-03-17 07:47:26.721 -04 [2272] LOG: database system was interrupted;
> last known up at 2020-03-16 17:56:20 -04
> 2020-03-17 07:47:34.974 -04 [2884] FATAL: the database system is starting
> up
> 2020-03-17 07:47:39.378 -04 [2272] LOG: database system was not properly
> shut down; automatic recovery in progress
> 2020-03-17 07:47:39.631 -04 [2272] LOG: redo starts at 38/F30F5108
> 2020-03-17 07:47:40.038 -04 [3052] FATAL: the database system is starting
> up
> 2020-03-17 07:47:40.126 -04 [2272] LOG: redo done at 38/F4001738
> 2020-03-17 07:47:40.207 -04 [2272] LOG: last completed transaction was at
> log time 2020-03-16 18:06:53.269798-04
> 2020-03-17 07:47:42.590 -04 [2080] LOG: database system is ready to
> accept connections
>
> In this shutdown/startup process, this PostgreSQL instance skipped
> archiving a wal file.
> Another thing that I perceived today is that in another instance, days
> after (about a week) skipping a wal file, the file was finally archived.
> Both of these instances that I talked about also have streaming
> replication configured. But I don't think it is related because there other
> instances skipping archiving wal files that don't have streaming
> replication configured.
>
> Norberto
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Michael Paquier 2020-03-19 13:09:28 Re: RPMs from postgresql.org break CentOS/RHEL RPMs
Previous Message Peter Krefting 2020-03-19 12:41:55 RPMs from postgresql.org break CentOS/RHEL RPMs