From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: AIO writes vs hint bits vs checksums |
Date: | 2024-10-30 02:21:32 |
Message-ID: | lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Attached is a, unfortunately long, series of patches implementing what I
described upthread.
0001 Add very basic test for kill_prior_tuples
We currently don't exercise this patch for gist and hash, which seems
somewhat criminal.
0002 heapam: Move logic to handle HEAP_MOVED into a helper function
A prep patch, which consolidates HEAP_MOVED handling into a helper. This
isn't strictly necessary, but I got bothered by it while writing this patch
series.
0003 bufmgr: Add BufferLockHeldByMe()
Useful for writing assertions in later patches.
0004 heapam: Use exclusive lock on old page in CLUSTER
When I started looking into this I wanted to make sure that it's actually
safe to skip setting hint bits in all places. One place that's *not* safe
is cluster - we assume that all hint bits are set in rewrite_heap_tuple().
This might not strictly be required, but it seemed very fragile as
is. While using an exclusive lock does seem like it makes it a bit safer,
it's not a very good fix, as we still dirty the old buffer, which seems
like a shame when one uses VACUUM FULL.
0005 heapam: Only set tuple's block once per page in pagemode
Minor, but measurable, optimization.
0006 heapam: Add batch mode mvcc check and use it in page mode
This is a good bit faster on its own, but is required to avoid a
performance regression later, when setting hint bits only only when no IO
going on at the same.
0007 bufmgr: Make it easier to change number of buffer state bits
This just makes it easier to change the division of bits in
BufferDesc->state.
0008 bufmgr: Add interface to acquire right to set hint bits
The main change. Adds BufferPrepareToSetHintBits() which, if true is
returned, gives the rigth to modify a buffer without an exclusive
lock. Once the modification is done BufferFinishSetHintBits() needs to be
called.
I dithered a bit on the name, we don't really call them hint bits in all
the places that modify buffers without an exlusive lock.
The new infra is unused as of this commit.
0009 heapam: Acquire right to set hint bits
Moves over heapam to the new interface.
0010 Acquire right to set hint bits in the remaining places
Moves indexams and freespace/ over to the new interface.
0011 bufmgr: Detect some missing BufferPrepareToSetHintBits() calls
Now that we don't set hint bits without calling
BufferPrepareToSetHintBits() first, we can detect many missing calls by
adding an assertion to MarkBufferDirtyHint().
0012 bufmgr: Don't copy pages while writing out
Finally remove the copy of the buffer while writing it out.
0013 WIP: bufmgr: Detect some bad buffer accesses
This is a lot more thorough infrastructure for detecting modifications of
buffers without appropriate locks, by using mprotect() to detect such
cases. Only works when not using huge pages, and not on windows (not sure
how mprotect() is spelled there) and not arm macos (due to 16kB pages).
Has two modes, one just disallows modifications, and is reasonably
cheap. The more expensive mode disallows any access when the buffer is
unpinned. The commit message has some performance numbers.
This has proven to be a good investment as far as this thread goes - I
wasn't initially aware that freespace modified buffers without a lock and
this quickly found that.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andrei Lepikhov | 2024-10-30 02:44:52 | Re: Cleanup SubPlanstate |
Previous Message | jian he | 2024-10-30 02:05:00 | Re: make all ereport in gram.y print out relative location |