Re: URGENT issue: pg-xlog growing on master!

From: Matheus de Oliveira <matioli(dot)matheus(at)gmail(dot)com>
To: Niels Kristian Schjødt <nielskristian(at)autouncle(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org list" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: URGENT issue: pg-xlog growing on master!
Date: 2013-06-10 17:59:18
Message-ID: CAJghg4KktXZipoqF9FCMq7jx56A0bd6pC2Ar6xNOm42FUdw_Jw@mail.gmail.com
Lists: pgsql-performance

On Mon, Jun 10, 2013 at 12:35 PM, Niels Kristian Schjødt <
nielskristian(at)autouncle(dot)com> wrote:

>
> On 10/06/2013 at 16:36, bricklen <bricklen(at)gmail(dot)com> wrote:
>
> On Mon, Jun 10, 2013 at 4:29 AM, Niels Kristian Schjødt <
> nielskristian(at)autouncle(dot)com> wrote:
>
>>
>> 2013-06-10 11:21:45 GMT FATAL: could not connect to the primary server:
>> could not connect to server: No route to host
>> Is the server running on host "192.168.0.4" and accepting
>> TCP/IP connections on port 5432?
>>
>
> Did anything get changed on the standby or master around the time this
> message started occurring?
> On the master, what do the following show?
> show port;
> show listen_addresses;
>
> The master's IP is still 192.168.0.4?
>
> Have you tried connecting to the master using something like:
> psql -h 192.168.0.4 -p 5432 -U postgres -d postgres
>
> Does that throw a useful error or warning?
>
>
>
> It turned out that the switch port that the server was connected to was
> faulty, and hence no successful connection between master and slave was
> established. This resulted in pg_xlog building up very fast, because our
> system performs a lot of changes on the data we store.
>
> I ended up running pg_archivecleanup on the master to get some space freed
> urgently. Then I got the switch replaced with a new one. Now I'm trying to
> set up the streaming replication from scratch again, but with no luck.
>
> I can't seem to figure out which steps I need to do, to get the standby
> server wiped and get it started as a streaming replication again from
> scratch. I tried to follow the steps, from step 6, in here
> http://wiki.postgresql.org/wiki/Streaming_Replication but the process
> seems to fail when I reach the point where I try to do a psql -c "SELECT
> pg_stop_backup()". It just says:
>
> NOTICE: pg_stop_backup cleanup done, waiting for required WAL segments to
> be archived
> WARNING: pg_stop_backup still waiting for all required WAL segments to be
> archived (60 seconds elapsed)
> HINT: Check that your archive_command is executing properly.
> pg_stop_backup can be canceled safely, but the database backup will not be
> usable without all the WAL segments.
> (...)
>
> When looking at ps aux on the master, I see the following:
>
> postgres 30930 0.0 0.0 98412 1632 ? Ss 15:59 0:02 postgres:
> archiver process failed on 0000000200000E1B000000A9
>
> The file mentioned is the one that it was about to archive when the
> standby server failed. Somehow it must still be trying to "catch up" from
> that file, which of course isn't there any more, since I had to remove those
> files in order to get more space on the HDD. Instead of trying to catch up
> from the last successfully archived file, I want it to start over from
> scratch with the replication - I just don't know how.
>
>
That is because you manually removed some xlog files, which you should
never do. To "cancel" the archiving, a better way (IMHO) is to set
archive_command to a dummy command, like:

archive_command = '/bin/true'

And reload PostgreSQL:

psql -c "SELECT pg_reload_conf()"

With that, PostgreSQL will stop archiving, and so you will **have no
backup at all** while the dummy command is in place. Once some archives
have been removed, you can switch back to your old archive_command and
reload the server again.
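
For reference, the switch to the dummy command could be scripted roughly as
below. This is only a sketch: set_archive_command is a helper name made up
for this example, and $PGDATA is assumed to point at your data directory. It
relies on the fact that when postgresql.conf sets the same parameter more
than once, the last entry wins, so appending a line is enough to override
the old value. The helper keeps a .bak copy of the file first, and it does
not escape single quotes inside the command.

```shell
# Hypothetical helper: override archive_command by appending a new entry.
# PostgreSQL uses the last occurrence of a parameter in postgresql.conf.
set_archive_command() {
    conf_file=$1
    new_command=$2

    # Keep a copy so the previous setting can be restored later.
    cp -p "$conf_file" "$conf_file.bak" || return 1

    # Leading newline guards against a file that lacks a final newline.
    printf "\narchive_command = '%s'\n" "$new_command" >> "$conf_file"
}

# Example usage (needs a running server, so it is left commented out):
#   set_archive_command "$PGDATA/postgresql.conf" /bin/true
#   psql -c "SELECT pg_reload_conf()"
#   psql -c "SHOW archive_command"
```

To go back later, either restore the .bak copy or call the helper again with
your original command, and then reload once more.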

BTW, check why the archive_command is not working properly (look at PG's
log files). Is it because there is no space left on disk? If so, freeing
some space may be enough.
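
A quick way to see how far behind the archiver is: on releases of this era,
each segment waiting to be archived has a matching .ready file under
pg_xlog/archive_status (the directory is pg_wal/archive_status from
PostgreSQL 10 on). The sketch below counts them. count_ready_segments is a
name invented for this example, and /path/to/archive stands in for whatever
directory your archive_command writes to.

```shell
# Hypothetical helper: count WAL segments still waiting to be archived.
count_ready_segments() {
    status_dir=$1

    if [ ! -d "$status_dir" ]; then
        echo "no such directory: $status_dir" >&2
        return 1
    fi

    # One .ready file exists per segment the archiver has not yet handled.
    find "$status_dir" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}

# Example usage (paths are assumptions, adjust to your setup):
#   count_ready_segments "$PGDATA/pg_xlog/archive_status"
#   df -h /path/to/archive
```

If the count keeps growing while df shows the archive destination is full,
that would point to disk space as the reason archive_command is failing.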

Regards,
--
Matheus de Oliveira
Database Analyst
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
