From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Kevin Grittner <kgrittn(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Question about behavior of snapshot too old feature |
Date: | 2016-10-21 10:06:28 |
Message-ID: | CAD21AoD2r6HRg5Go4NGeo3mcERRbW2b+_bL35dzqC5k05VLp8A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Oct 17, 2016 at 10:04 PM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:
> On Sun, Oct 16, 2016 at 9:26 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
>> When I set old_snapshot_threshold = 0 I got error at step #3, which
>> means that the error is occurred without table pruning.
>
> The "snapshot too old" error can happen without pruning, but only
> because there is no way to tell the difference between a page that
> has been pruned since the snapshot was taken and a page which has
> had some other kind of modification since the snapshot was taken.
>
> Ignoring false positives for a moment (where the page is updated by
> something other than pruning), what is required for early pruning
> is that the snapshot has expired (which due to "rounding" and
> avoidance of locking could easily take up to a minute or two more
> than the old_snapshot_threshold setting) and then there is page
> pruning due to a vacuum or just HOT pruning from a page read. At
> some point after that, a read which is part of returning data to
> the user (e.g., not just positioning for index modification) can
> see that the snapshot is too old and that the LSN for the page is
> past the snapshot LSN. That is when you get the error.
>> We have regression test for this feature but it sets
>> old_snapshot_threshold = 0, I doubt about we can test it properly.
>> Am I missing something?
>
> This is a hard feature to test properly, and certainly hard to test
> without the test running for a long time. The zero setting is
> really not intended to be used in production, but only to allow
> some half-way decent testing that doesn't take extreme lengths of
> time. If you add some delays of a few minutes each at key points
> in a test, you should be able to get a test that works with a
> setting of 1min. It is not impossible that we might need to add a
> memory barrier to one or two places to get such tests to behave
> consistently, but I have not been able to spot where, if anywhere,
> that would be.
Thank you for explanation! I understood.
When old_snapshot_threshold = 0, it skips to allocate shared memory
area for the xid array and skips the some logic in order to avoid
using the shared memory, so I was concerned about that a little.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Constantin S. Pan | 2016-10-21 10:08:58 | Re: Fun fact about autovacuum and orphan temp tables |
Previous Message | Pantelis Theodosiou | 2016-10-21 08:54:51 | Re: Indirect indexes |