From: | Vladimir Borodin <root(at)simply(dot)name> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Broken hint bits (freeze) |
Date: | 2017-05-25 06:05:20 |
Message-ID: | D7B95626-BF11-4E7E-AF10-0AB4B5BE9E79@simply.name |
Lists: | pgsql-hackers |
> On 24 May 2017, at 15:44, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, May 24, 2017 at 7:27 AM, Dmitriy Sarafannikov
> <dsarafannikov(at)yandex(dot)ru> wrote:
>> It seems like the replica did not replay the corresponding WAL records.
>> Any thoughts?
>
> heap_xlog_freeze_page() is a pretty simple function. It's not
> impossible that it could have a bug that causes it to incorrectly skip
> records, but it's not clear why that wouldn't affect many other replay
> routines equally, since the pattern of using the return value of
> XLogReadBufferForRedo() to decide what to do is widespread.
>
> Can you prove that other WAL records generated around the same time as
> the freeze record *were* replayed on the master? If so, that proves
> that this isn't just a case of the WAL never reaching the standby.
> Can you look at the segment that contains the relevant freeze record
> with pg_xlogdump? Maybe that record is messed up somehow.
Not yet. Most such cases occurred long before our recovery window, so the corresponding WAL segments have already been deleted. We have already tuned the retention policy and are now looking for a fresh case.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
--
May the force be with you…
https://simply.name