From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-08-07 16:09:50
Message-ID: 2756242.1596816590@sss.pgh.pa.us
Lists: pgsql-hackers
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Aug 7, 2020 at 12:03 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yeah, there is no room for "good enough" here. If a dirty buffer remains
>> in the system, the checkpointer will eventually try to flush it, and fail
>> (because there's no file to write it to), and then checkpointing will be
>> stuck. So we cannot afford to risk missing any buffers.
> This comment suggests another possible approach to the problem, which
> is to just make a note someplace in shared memory when we drop a
> relation. If we later find any of its buffers, we drop them without
> writing them out. This is not altogether simple, because (1) we don't
> have infinite room in shared memory to accumulate such notes and (2)
> it's not impossible for the OID counter to wrap around and permit the
> creation of a new relation with the same OID, which would be a problem
> if the previous note is still around.
Interesting idea indeed.
As for (1), maybe we don't need to keep the info in shmem. I'll just
point out that the checkpointer has *already got* a complete list of all
recently-dropped relations, cf pendingUnlinks in sync.c. So you could
imagine looking aside at that to discover that a dirty buffer belongs to a
recently-dropped relation. pendingUnlinks would need to be converted to a
hashtable to make searches cheap, and it's not very clear what to do in
backends that haven't got access to that table, but maybe we could just
accept that backends that are forced to flush dirty buffers might do some
useless writes in such cases.
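To illustrate the lookup side, here is a rough sketch (purely illustrative:
DroppedRelHash, InitDroppedRelHash, and RelWasRecentlyDropped are made-up
names, and mirroring pendingUnlinks into a hash keyed by RelFileNode is just
one way it might be done):

    #include "postgres.h"
    #include "storage/relfilenode.h"
    #include "utils/hsearch.h"

    /* Hypothetical hash mirroring pendingUnlinks, keyed by RelFileNode */
    static HTAB *DroppedRelHash = NULL;

    static void
    InitDroppedRelHash(void)
    {
        HASHCTL     ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(RelFileNode);
        ctl.entrysize = sizeof(RelFileNode);    /* key-only entries */
        DroppedRelHash = hash_create("recently dropped rels", 64, &ctl,
                                     HASH_ELEM | HASH_BLOBS);
    }

    /* Checkpointer would consult this before flushing a dirty buffer */
    static bool
    RelWasRecentlyDropped(const RelFileNode *rnode)
    {
        bool        found = false;

        if (DroppedRelHash != NULL)
            (void) hash_search(DroppedRelHash, rnode, HASH_FIND, &found);
        return found;
    }

The write-out path (BufferSync()/SyncOneBuffer(), say) could then invalidate
rather than write any dirty buffer whose tag's rnode is found there.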
As for (2), the reason why we have that list is that the physical unlink
doesn't happen till after the next checkpoint. So with some messing
around here, we could probably guarantee that every buffer belonging
to the relation has been scanned and deleted before the file unlink
happens --- and then, even if the OID counter has wrapped around, the
OID can't be reassigned to a new relation until after the unlink is done.
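Roughly, the post-checkpoint processing in sync.c would then need to look
something like this (a sketch only: DropAllBuffersForRel() is a hypothetical
placeholder for however we scan the buffer pool, and the real list-cleanup
details of SyncPostCheckpoint() are elided):

    /*
     * Sketch of the required ordering, not working code: every stale
     * buffer is gone before any unlink() happens, so a wrapped-around
     * OID can't be handed out while old buffers still exist.
     */
    static void
    SyncPostCheckpointSketch(void)
    {
        ListCell   *lc;

        /* Pass 1: discard remaining buffers of each to-be-unlinked rel */
        foreach(lc, pendingUnlinks)
        {
            PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(lc);

            DropAllBuffersForRel(&entry->tag.rnode);    /* hypothetical */
        }

        /* Pass 2: only now remove the files, as SyncPostCheckpoint() does */
        foreach(lc, pendingUnlinks)
        {
            /* ... existing unlink logic ... */
        }
    }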
In short, it seems like maybe we could shove the responsibility for
cleaning up dropped relations' buffers onto the checkpointer without
too much added cost. A possible problem with this is that recycling
of those buffers will happen much more slowly than it does today,
but maybe that's okay?
regards, tom lane