Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Date: 2016-01-16 13:07:37
Message-ID: CAB7nPqTA-ByticAAPZz77+RyxuHhTjK8fbp=AScdC0xjCU1zPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Sat, Jan 16, 2016 at 9:07 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Sat, Jan 16, 2016 at 5:08 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
> wrote:
>>
>> On Sat, Jan 16, 2016 at 7:10 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>> > On Sun, Dec 20, 2015 at 6:44 PM, Michael Paquier
>> > <michael(dot)paquier(at)gmail(dot)com> wrote:
>> >> > Looking at the code, it occurred to me that the LSN position saved
>> >> > for
>> >> > a XLOG_SWITCH record is the last position of current segment, so we
>> >> > would still need to check if the current insert LSN matches the
>> >> > beginning of a new segment and if the last segment was forcibly
>> >> > switched by saving RecPtr of RequestXLogSwitch in XLogCtl for
>> >> > example.
>> >> > Thoughts?
>> >>
>> >> I haven't given up on this patch yet, and putting again my head on
>> >> this problem I have finished with the patch attached, which checks if
>> >> the current insert LSN position is at the beginning of a segment that
>> >> has just been switched to decide if a standby snapshot should be
>> >> logged or not. This allows bringing back an idle system to the pre-9.3
>> >> state where a segment would be archived in the case of a low
>> >> archive_timeout only when a checkpoint has been issued on the system.
>> >>
>> >
>> > Won't this be a problem if the checkpoint occurs after a long time and
>> > in
>> > the mean time there is some activity in the server?
>>
>> Why? If there is some activity on the server, the snapshot will be
>> immediately taken at the next iteration without caring about the
>> checkpoint.
>>
>
> + (insert_lsn % XLOG_SEG_SIZE) != SizeOfXLogLongPHD))
>
> Do you mean to intend that it is protected by above check in the
> patch?

Yep, in combination with is_switch_current to check if the last
completed segment was forcibly switched.

> Isn't it possible that so much WAL is inserted between bgwriter cycles,
> that when it checks the location of WAL, it founds it to be at the beginning
> of a new segment?

Er, no. Even if the insert LSN is at the beginning of a new segment,
this would take a standby snapshot if the last segment was not
forcibly switched.

>> > Another idea to solve this issue could be to see if there is any
>> > progress
>> > in the server by checking buffers dirtied/written (we can refer that
>> > information using pgBufferUsage) since last time we log this record in
>> > bgwriter.
>>
>> Yes, that may be an idea worth considering, but I really think that we
>> had better measure that at WAL level..
>
> I thought this is quite close to the previous patch you proposed where
> Andres wanted some measurement in terms of progress since last
> checkpoint. I understand as per current code your patch can work, but
> what if some more similar WAL records needs to be added?

A record like this one for the bgwriter? We could still rely on the
same check tracking the last-forced-segment, no?

It seems to me that tracking progress here is not really only a matter
of the number of shared buffers dirtied or written, we would need as
well to track XLOG_STANDBY_LOCK and provide for them fresh snapshots.
Imagine for example a read-only load on the master with some exclusive
locks taken on relations from time to time (perhaps unlikely so but
who knows?).
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2016-01-16 13:12:02 Re: BUG #13863: Select from views gives wrong results
Previous Message Master ZX 2016-01-16 12:51:36 Re[2]: [BUGS] BUG #13869: Right Join query that never ends

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-01-16 13:45:29
Previous Message Amit Kapila 2016-01-16 12:07:40 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby