Re: [Patch] Optimize dropping of relation buffers using dlist

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, k(dot)jamison(at)fujitsu(dot)com, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-11-05 23:31:52
Message-ID: CA+hUKGKNMkrSApBn8GXCemYvGPGS-s+WSBA-b8gioiWx=D1F=g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 5, 2020 at 10:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> I still feel 'cached' is a better name.

Amusingly, this thread is hitting all the hardest problems in computer
science according to the well-known aphorism...

Here's a devil's advocate position I thought about: It's OK to leave
stray buffers (clean or dirty) in the buffer pool if files are
truncated underneath us by gremlins, as long as your system eventually
crashes before completing a checkpoint. The OID can't be recycled
until after a successful checkpoint, so the stray blocks can't be
confused with the blocks of another relation, and weird errors are
expected on a system that is in serious trouble. It's actually much
worse that we can give incorrect answers to queries when files are
truncated by gremlins (in the window of time before we presumably
crash because of EIO), because we're violating basic ACID principles
in user-visible ways. In this thread, discussion has focused on
availability (i.e. avoiding failures when trying to write back stray
buffers to a file that has been unlinked), but really a system that
can't see arbitrary committed transactions *shouldn't be available*.
This argument applies whether you think SEEK_END can only give weird
answers in the specific scenario I demonstrated with NFS, or whether
you think it's arbitrarily b0rked and reports random numbers: we
fundamentally can't tolerate that, so why are we trying to?
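
For readers following along, here is a minimal sketch of the kind of
SEEK_END probe the argument is about: deriving a block count from
whatever file length the OS reports. This is not PostgreSQL's actual
code; sketch_nblocks and SKETCH_BLCKSZ are hypothetical names, and 8192
is assumed as the block size. If lseek() reports a length that is too
short, every block past it silently looks as if it does not exist.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption for illustration: 8 kB blocks. */
#define SKETCH_BLCKSZ 8192

/*
 * Hypothetical helper: number of whole blocks in the file behind fd,
 * trusting whatever length lseek(SEEK_END) reports.
 * Returns -1 if lseek() fails.
 */
static long
sketch_nblocks(int fd)
{
    off_t len = lseek(fd, 0, SEEK_END);

    if (len < 0)
        return -1;

    /* A trailing partial block is not counted. */
    return (long) (len / SKETCH_BLCKSZ);
}
```

The caller has no way to tell a correct answer from a stale or
truncated one, which is the point above: anything layered on this
number inherits whatever the filesystem chose to report.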
