Re: Bad recovery: no pg_xlog/RECOVERYXLOG

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Bad recovery: no pg_xlog/RECOVERYXLOG
Date: 2017-11-03 22:00:03
Message-ID: 45c0bedc-9ce8-b99b-80df-94a1180fbc88@catalyst.net.nz
Lists: pgsql-admin

Stephen,

On 03/11/17 00:11, Stephen Frost wrote:

>
> Sure, that'll work much of the time, but that's about like saying that
> PG could run without fsync being enabled much of the time and everything
> will be ok. Both are accurate, but hopefully you'll agree that PG
> really should always be run with fsync enabled.

It is completely different - this is a 'straw man' argument, and just
serves to confuse the discussion.

>
>> Also, if what you are suggesting were actually the case, almost
>> everyone's streaming replication (and/or log shipping) would be
>> broken all the time.
> No, again, this isn't an argument about if it'll work most of the time
> or not, it's about if it's correct. PG without fsync will work most of
> the time too, but that doesn't mean it's actually correct.

No, it is pointing out that if your argument were correct, then we
should see the side effects described above - we do not, which is
significant.

The crux of your argument seems to concern the synchronization between
pg_basebackup finishing and being sure you have the required archive
logs. Just so we are all clear: when pg_basebackup ends it essentially
calls do_pg_stop_backup (from xlog.c), which ensures that all required
WAL files are archived - or, to be precise, that archive_command has
returned success for each required WAL file.
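For completeness, one can confirm the archiver has kept up after the
backup ends by checking the pg_stat_archiver view (available since
9.4). A quick sketch, assuming psql access to the primary:

```shell
# Has archive_command succeeded for everything, and when was the last
# success? A non-zero failed_count means segments are being retried.
psql -At -c "SELECT archived_count, last_archived_wal,
                    last_archived_time, failed_count
             FROM pg_stat_archiver;"
```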

Your entire argument seems to be about whether said WAL is fsync'ed to
disk, and whether this is impossible to ensure in a shell script.
Actually it is quite simple to ensure: e.g. suppose your
archive_command is:

rsync ... targetserver:/disk

There are several ways to get that to sync:

rsync ... targetserver:/disk && ssh targetserver sync

Alternatively, amend vm.dirty_bytes on targetserver to be < 16M, or
mount /disk with the sync option!

So it is clearly *possible*.
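To make the copy-then-sync pattern concrete, here is a minimal local
sketch - all paths and the segment name are illustrative stand-ins; in
production the cp/sync pair would be the rsync/ssh pair above:

```shell
# Simulate archiving one WAL segment locally, then flush it to disk.
# In production: rsync "$WAL_SEG" targetserver:/disk && ssh targetserver sync
ARCHIVE_DIR=$(mktemp -d)    # stand-in for targetserver:/disk
WAL_SEG=$(mktemp)           # stand-in for the WAL segment (%p)
printf 'dummy wal contents' > "$WAL_SEG"

# archive_command analogue: copy, then force dirty pages out of the cache
cp "$WAL_SEG" "$ARCHIVE_DIR/000000010000000000000001" && sync

cat "$ARCHIVE_DIR/000000010000000000000001"
```

The point is only that the durability step is a single extra command,
not that this exact script is a backup tool.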

However, I think you are obsessing over the minutiae of fsync to a
single server/disk when there are much more important (read: likely to
happen) problems to consider. For me, the critical consideration is
not 'are the WAL files there *right now*?' but 'will they be there
tomorrow when I need them for a restore?' Next is 'will they be the
same/undamaged when I read them tomorrow?'

This is why I'm *not* obsessing about fsyncing: make where you store
these WAL files *reliable* - either via proxying/IP splitting so you
send stuff to more than one server (if we are still thinking server +
disk = backup solution), or use a distributed object store (Swift, S3,
etc.), which handles that for you and, in addition, checksums and
heals any individual node's data corruption as well.

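For the object-store route the archive_command can ship each segment
straight to the store. A hypothetical postgresql.conf fragment using
the AWS CLI (the bucket name is made up; Swift's client works
analogously):

```
# postgresql.conf - hypothetical: ship each WAL segment to S3
# (%p = path to the segment, %f = its file name)
archive_command = 'aws s3 cp %p s3://my-wal-archive/%f'
```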
>> With respect to 'If I would like to develop etc etc..' - err, all I
>> was doing in this thread was helping the original poster make his
>> stuff a bit better - I'll continue to do that.
> Ignoring the basic requirements which I outlined isn't helping him get
> to a reliable backup system.

Actually I was helping him get a *reliable* backup system; I think you
misunderstood how Swift changes the picture compared to a single
server/single disk design.

regards

Mark
