From: | Linas Virbalas <linas(dot)virbalas(at)continuent(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hot Backup with rsync fails at pg_clog if under load |
Date: | 2011-09-22 14:24:50 |
Message-ID: | CAA11FE2.1DDE2%linas.virbalas@continuent.com |
Lists: | pgsql-hackers |
>>> 2.2. pg_start_backup('backup_under_load') on the master (this will take a
>>> while as master is loaded up);
>>
>> No. if you use pg_start_backup('foo', true) it will be fast. Check the
>> manual.
>
> If the server is sufficiently heavily loaded that a checkpoint takes a
> nontrivial amount of time, the OP is correct that this will be not
> fast, regardless of whether you choose to force an immediate
> checkpoint.
To check more cases, I have changed the procedure to force an
immediate checkpoint, i.e. pg_start_backup('backup_under_load', true). With
the same load generator running, pg_start_backup returned almost
instantaneously compared to how long it took previously.
Most importantly, after making this change, I can no longer reproduce the
pg_clog error message. In other words, with an immediate checkpoint the hot
backup succeeds under this load!
>>> 2.3. rsync data/global/pg_control to the standby;
>>
>> Why are you doing this? If ...
>>
>>> 2.4. rsync all other data/ (without pg_xlog) to the standby;
>>
>> you will copy it again or no? Don't understand your point.
>
> His point is that exercising the bug depends on doing the copying in a
> certain order. Any order of copying the data theoretically ought to
> be OK, as long as it's all between starting the backup and stopping
> the backup, but apparently it isn't.
Please note that in the past I was able to reproduce the same pg_clog error
even when taking a single rsync of the whole data/ folder (i.e. without
splitting it into two steps).
>> The problem could be that the minimum recovery point (step 2.3) is different
>> from the end of rsync if you are under load.
Do you have any idea why the Hot Backup operation with
pg_start_backup('backup_under_load', true) succeeds while
pg_start_backup('backup_under_load') fails under the same load?
Originally, I was using pg_start_backup('backup_under_load') in order not to
clog the master server during the I/O required for the checkpoint. Of
course, now, it seems, this should be sacrificed for the sake of a
successful backup under load.
> It seems pretty clear that some relevant chunk of WAL isn't getting
> replayed, but it's not at all clear to me why not. It seems like it
> would be useful to compare the LSN returned by pg_start_backup() with
If needed, I could do that, if I had the exact procedure... Currently,
at the start of the backup I record the following:
pg_xlogfile_name(pg_start_backup(...))
> the location at which replay begins when you fire up the clone.
As you have seen in my original message, in the pg_log I get only the
restored WAL file names after starting up the standby. Can I tune
postgresql.conf so that the log includes the location at which replay begins?
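
The comparison suggested above could be sketched roughly as follows, once
both locations are available as text in the usual 'X/Y' hexadecimal form.
This is only an illustration: the function names are hypothetical, the
16 MB segment size and timeline 1 are assumptions (the defaults), and the
file-name helper merely mirrors the naming scheme of the WAL files rather
than calling the server.

```python
import re

# Assumption: default 16 MB WAL segments.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN such as '2/5A0000B8' into one integer byte position."""
    match = re.fullmatch(r"([0-9A-Fa-f]{1,8})/([0-9A-Fa-f]{1,8})", text.strip())
    if match is None:
        raise ValueError(f"not an LSN: {text!r}")
    high, low = (int(part, 16) for part in match.groups())
    return (high << 32) | low


def wal_file_name(lsn_text, timeline=1):
    """Name of the WAL segment file that contains the given LSN."""
    position = parse_lsn(lsn_text)
    high = position >> 32
    segment = (position & 0xFFFFFFFF) // WAL_SEGMENT_SIZE
    return f"{timeline:08X}{high:08X}{segment:08X}"


def replay_starts_late(backup_start_lsn, replay_start_lsn):
    """True if replay begins after the location pg_start_backup() returned."""
    return parse_lsn(replay_start_lsn) > parse_lsn(backup_start_lsn)
```

If replay_starts_late() returned True for a given clone, that would point at
WAL between the two locations not being replayed.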
> Could you provide us with the exact rsync version and parameters you use?
rsync -azv
version 2.6.8 protocol version 29
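
For reference, the two copy steps (2.3 and 2.4) with these flags could be
assembled roughly as below. This is a sketch under assumptions: the function
name, the data directory path and the standby host are hypothetical, and the
--exclude pattern is one possible way to leave out pg_xlog, not necessarily
the one used in the original procedure.

```python
def rsync_steps(data_dir, standby, flags="-azv"):
    """Build the argument lists for the two-step copy, without running them."""
    src = data_dir.rstrip("/")
    dest = f"{standby}:{src}"
    return [
        # Step 2.3: copy pg_control first.
        ["rsync", flags, f"{src}/global/pg_control", f"{dest}/global/pg_control"],
        # Step 2.4: copy everything else under data/, leaving out pg_xlog.
        # A leading "/" anchors the pattern at the root of the transfer.
        ["rsync", flags, "--exclude=/pg_xlog", f"{src}/", f"{dest}/"],
    ]
```

Each list can be passed to subprocess.run() between pg_start_backup() and
pg_stop_backup().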
--
Sincerely,
Linas Virbalas
http://flyingclusters.blogspot.com/