From: | Eduardo Morras <emorrasg(at)yahoo(dot)es> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | Re: Standby is not removing restored WAL segments |
Date: | 2014-09-09 08:45:24 |
Message-ID: | 20140909104524.9c158f39a3edff9166a376d0@yahoo.es |
Lists: | pgsql-admin |
On Fri, 5 Sep 2014 09:33:57 +0200
Alexey Klyukin <alexk(at)hintbits(dot)com> wrote:
> Greetings,
>
> We've got a 9.3.5 DB running in a standby mode for a fairly large DB
> (500GB) with a busy WAL traffic (couple of GBs per hour) and it
> occasionally 'forgets' to remove the segments it restored.
>
> The checkpoint_segments is set to 128, and usually we observe around
> 270 segments accumulated, but at the time it happens our check
> triggers at around 2K segments. The manual checkpoint command takes
> ages to complete there, the fast shutdown is very slow (around 10
> minutes, usually less than 1 minute) and the WAL receiver process is
> also unable to run for some reason.
>
> The only way to make this host delete WAL files is to restart it. The
> particularly notable restart point right after the shutdown shows
> quite a number of removed files and buffers written (the shared
> buffers is set to 8GB on this system):
>
> 2014-09-04 14:39:33.376 CEST,,,22354,,537a4553.5752,88217,,2014-05-19
> 19:54:27 CEST,,0,LOG,00000,"restartpoint complete: wrote 332473
> buffers (31.7%); 0 transaction log file(s) added, 1237 removed, 6
> recycled; write=9.745 s, sync=680.314 s, total=694.447 s; sync
> files=499, longest=37.774 s, average=1.363 s",,,,,,,,,""
>
> If we leave the host running, this restartpoint never happens.
>
> The only difference I can come up with from the other databases that
> do not show this behavior is that the host is running with
> max_standby_streaming_delay and max_standby_archive_delay set to -1,
> but at the time we observed the problem no queries were running on it
> at all.
>
> The problem occurs rarely, but steadily, around once every 3 months.
> During this time PostgreSQL has been upgraded from 9.0 to 9.3,
> which did not solve the issue.
>
Perhaps the WAL files are being deleted before the filesystem has actually closed them, so the delete fails with a "file still open" error.
> Any clues on how we can debug and diagnose the problem further to come
> up with a proper bug report, if it is a bug, or are we missing
> something in the configuration that causes this?
>
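One possible way to gather data for a bug report is to pull the restartpoint numbers out of the standby's log and watch how "removed", "recycled" and the sync time change over the weeks. The following is only a rough sketch: it assumes the message text matches the 9.3 log line quoted above, and the function name parse_restartpoint is made up for illustration.

```python
import re

# Pattern for the "restartpoint complete" message, modeled on the 9.3 log
# line quoted above. Other PostgreSQL versions may word it differently.
RESTARTPOINT_RE = re.compile(
    r"restartpoint complete: wrote (?P<buffers>\d+) buffers "
    r"\((?P<buffers_pct>[\d.]+)%\); "
    r"(?P<added>\d+) transaction log file\(s\) added, "
    r"(?P<removed>\d+) removed, (?P<recycled>\d+) recycled; "
    r"write=(?P<write_s>[\d.]+) s, sync=(?P<sync_s>[\d.]+) s, "
    r"total=(?P<total_s>[\d.]+) s"
)

INT_FIELDS = ("buffers", "added", "removed", "recycled")


def parse_restartpoint(line):
    """Return the restartpoint counters from one log line, or None.

    Hypothetical helper: whitespace is collapsed first so that a message
    wrapped across several lines still matches.
    """
    flat = " ".join(line.split())
    match = RESTARTPOINT_RE.search(flat)
    if match is None:
        return None
    result = {}
    for name, value in match.groupdict().items():
        # Counters become ints, percentages and timings become floats.
        result[name] = int(value) if name in INT_FIELDS else float(value)
    return result
```

Feeding every log line through this and recording the timestamp of the last match would show whether restartpoints stop completely before the segments pile up, or merely become very slow.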
>
> Regards,
> --
> Alexey Klyukin
--- ---
Eduardo Morras <emorrasg(at)yahoo(dot)es>