From: andres@anarazel.de
To: Robert Haas <robertmhaas@gmail.com>
Cc: Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>, Alvaro Herrera <alvherre@2ndquadrant.com>, Pg Hackers <pgsql-hackers@postgresql.org>, Alexander Korotkov <a.korotkov@postgrespro.ru>, Heikki Linnakangas <hlinnaka@iki.fi>
Subject: Re: [PATCH] Refactoring of LWLock tranches
Date: 2015-11-16 00:20:28
Message-ID: 20151116002028.GA24814@alap3.anarazel.de
Lists: pgsql-hackers
On 2015-11-15 16:24:24 -0500, Robert Haas wrote:
> I think it needs to be adapted to use predefined constants for the
> tranche IDs instead of hardcoded values, maybe based on the attached
> tranche-constants.patch.
Yea, that part is clearly not ok. Let me look at the patch.
> Also, I think we should rip all the volatile qualifiers out of
> bufmgr.c, using the second attached patch, devolatileize-bufmgr.patch.
> The comment claiming that we need this because spinlocks aren't
> compiler barriers is obsolete.
Same here.
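For anyone following along: since our spinlock acquire/release now act as
compiler barriers, the compiler can't cache header fields across them, so
the usual pattern is safe without the qualifier. Roughly (an illustration,
not a quote from bufmgr.c):

    BufferDesc *bufHdr = GetBufferDescriptor(buf_id);

    LockBufHdr(bufHdr);            /* spinlock acquire: compiler barrier */
    if (bufHdr->flags & BM_DIRTY)  /* flags is (re-)loaded from memory here */
        bufHdr->flags |= BM_CHECKPOINT_NEEDED;
    UnlockBufHdr(bufHdr);          /* spinlock release: compiler barrier */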
> @@ -457,7 +457,7 @@ CreateLWLocks(void)
> LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
> LWLockCounter[0] = NUM_FIXED_LWLOCKS;
> LWLockCounter[1] = numLocks;
> - LWLockCounter[2] = 1; /* 0 is the main array */
> + LWLockCounter[2] = LWTRANCHE_LAST_BUILTIN_ID + 1;
> }
Man, this LWLockCounter stuff is just awful. Nothing that needs to be
fixed in this patch, but it really should be fixed someday.
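To spell out what those three magic slots hold (my annotation, not part
of the patch):

    LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
    LWLockCounter[0] = NUM_FIXED_LWLOCKS;             /* LWLocks assigned so far */
    LWLockCounter[1] = numLocks;                      /* LWLocks allocated */
    LWLockCounter[2] = LWTRANCHE_LAST_BUILTIN_ID + 1; /* next tranche ID to hand out */

Named struct fields instead of bare array indexes would already be an
improvement.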
> /*
> + * We reserve a few predefined tranche IDs. These values will never be
> + * returned by LWLockNewTrancheId.
> + */
> +#define LWTRANCHE_MAIN 0
> +#define LWTRANCHE_BUFFER_CONTENT 1
> +#define LWTRANCHE_BUFFER_IO_IN_PROGRESS 2
> +#define LWTRANCHE_LAST_BUILTIN_ID LWTRANCHE_BUFFER_IO_IN_PROGRESS
Nitpick: I'm inclined to use an enum to avoid having to adjust the last
builtin id when adding a new builtin tranche.
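Something like the following sketch, reusing the names from the quoted
patch; the sentinel then tracks the end of the list automatically:

    typedef enum BuiltinTrancheIds
    {
        LWTRANCHE_MAIN,                  /* 0 is the main array */
        LWTRANCHE_BUFFER_CONTENT,
        LWTRANCHE_BUFFER_IO_IN_PROGRESS,
        LWTRANCHE_FIRST_USER_DEFINED     /* must stay last */
    } BuiltinTrancheIds;

    /* CreateLWLocks() then needs no manual adjustment: */
    LWLockCounter[2] = LWTRANCHE_FIRST_USER_DEFINED;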
Besides that minor thing, I think this works for me. We might separately
want something that makes adding non-builtin tranches easier, but that's
an independent project.
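For reference, registering a non-builtin tranche today takes roughly this
much ceremony (a sketch against the existing API; the myext_* names are
made up, and shmem allocation of the lock array is elided):

    static LWLock *myext_locks;         /* allocated from shared memory elsewhere */
    static LWLockTranche myext_tranche;
    static int  myext_tranche_id;

    static void
    myext_register_tranche(void)
    {
        myext_tranche_id = LWLockNewTrancheId();
        myext_tranche.name = "myext";
        myext_tranche.array_base = myext_locks;
        myext_tranche.array_stride = sizeof(LWLock);
        LWLockRegisterTranche(myext_tranche_id, &myext_tranche);
        LWLockInitialize(&myext_locks[0], myext_tranche_id);
    }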
> From 9e7f9219b5e752da46be0e26a0be074191ae8f62 Mon Sep 17 00:00:00 2001
> From: Robert Haas <rhaas(at)postgresql(dot)org>
> Date: Sun, 15 Nov 2015 10:24:03 -0500
> Subject: [PATCH 1/3] De-volatile-ize.
I very strongly think this should be done. It's painful having to cast
away volatile, and it de-optimizes a fair bit of performance-relevant
code.
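To illustrate the de-optimization (a sketch, not code from the patch):
with the qualifier the compiler has to emit a separate load of hdr->flags
for each test, whereas without it the value may be loaded once and kept in
a register:

    static bool
    valid_and_dirty_volatile(volatile BufferDesc *hdr)
    {
        /* two forced memory loads of hdr->flags */
        return (hdr->flags & BM_VALID) && (hdr->flags & BM_DIRTY);
    }

    static bool
    valid_and_dirty(BufferDesc *hdr)
    {
        /* compiler may load hdr->flags once for both tests */
        return (hdr->flags & BM_VALID) && (hdr->flags & BM_DIRTY);
    }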
> /* local state for StartBufferIO and related functions */
> -static volatile BufferDesc *InProgressBuf = NULL;
> +static BufferDesc *InProgressBuf = NULL;
> static bool IsForInput;
>
> /* local state for LockBufferForCleanup */
> -static volatile BufferDesc *PinCountWaitBuf = NULL;
> +static BufferDesc *PinCountWaitBuf = NULL;
Hm. These could also be relevant for sigsetjmp et al. Haven't found
relevant code tho.
> - volatile BufferDesc *bufHdr;
> + BufferDesc *bufHdr;
> Block bufBlock;
Looks mis-indented now, and similarly in a bunch of other places. Maybe
run pgindent over it afterwards?
> /*
> * Header spinlock is enough to examine BM_DIRTY, see comment in
> @@ -1707,7 +1707,7 @@ BufferSync(int flags)
> num_written = 0;
> while (num_to_scan-- > 0)
> {
> - volatile BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
> + BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
>
This case has some theoretical behavioural implications: as
bufHdr->flags is accessed without a spinlock, the compiler might re-use
an older value. But that's ok: a) there's no old value it really could
use; b) the whole race condition exists anyway, and the comment in the
function body explains why that's acceptable.
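Spelled out, the idiom is (a paraphrase of the BufferSync() logic, not a
verbatim quote):

    bool        needs_write;

    /* Unlocked peek: may see a stale flags value.  Harmless, because
     * the authoritative re-check below runs under the buffer header
     * spinlock, which is a compiler barrier. */
    if (!(bufHdr->flags & BM_CHECKPOINT_NEEDED))
        continue;

    LockBufHdr(bufHdr);
    needs_write = (bufHdr->flags & BM_CHECKPOINT_NEEDED) != 0;
    UnlockBufHdr(bufHdr);

    if (needs_write)
    {
        /* write the buffer out; the unlocked precheck was only an
         * optimization to skip uninteresting buffers cheaply */
    }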
> BlockNumber
> BufferGetBlockNumber(Buffer buffer)
> {
> - volatile BufferDesc *bufHdr;
> + BufferDesc *bufHdr;
>
> Assert(BufferIsPinned(buffer));
>
> @@ -2346,7 +2346,7 @@ void
> BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
> BlockNumber *blknum)
> {
> - volatile BufferDesc *bufHdr;
> + BufferDesc *bufHdr;
> /* Do the same checks as BufferGetBlockNumber. */
> Assert(BufferIsPinned(buffer));
> @@ -2382,7 +2382,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
> * as the second parameter. If not, pass NULL.
> */
> static void
> -FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln)
> +FlushBuffer(BufferDesc *buf, SMgrRelation reln)
> {
> XLogRecPtr recptr;
> ErrorContextCallback errcallback;
> @@ -2520,7 +2520,7 @@ RelationGetNumberOfBlocksInFork(Relation relation, ForkNumber forkNum)
> bool
> BufferIsPermanent(Buffer buffer)
> {
> - volatile BufferDesc *bufHdr;
> + BufferDesc *bufHdr;
These require that the buffer is pinned (a barrier-implying operation).
The pin prevents the contents from changing in a relevant manner, and
the barrier implied by the pin forces the core's view to be non-stale.
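I.e. roughly the following chain of guarantees (my annotation of the
BufferGetTag() pattern, assuming a pinned shared buffer):

    /* PinBuffer() bumped the refcount under the header spinlock; that
     * acquire/release is a compiler barrier, so nothing read after
     * pinning can be stale, and the pin keeps the buffer from being
     * evicted, so the tag can't change under us. */
    Assert(BufferIsPinned(buffer));
    bufHdr = GetBufferDescriptor(buffer - 1);   /* shared buffers only */
    *rnode = bufHdr->tag.rnode;                 /* stable while pinned */
    *forknum = bufHdr->tag.forkNum;
    *blknum = bufHdr->tag.blockNum;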
> @@ -2613,7 +2613,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
>
> for (i = 0; i < NBuffers; i++)
> {
> - volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> + BufferDesc *bufHdr = GetBufferDescriptor(i);
> /*
> * We can make this a tad faster by prechecking the buffer tag before
> @@ -2703,7 +2703,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
> for (i = 0; i < NBuffers; i++)
> {
> RelFileNode *rnode = NULL;
> - volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> + BufferDesc *bufHdr = GetBufferDescriptor(i);
>
> /*
> * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
> @@ -2767,7 +2767,7 @@ DropDatabaseBuffers(Oid dbid)
>
> for (i = 0; i < NBuffers; i++)
> {
> - volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> + BufferDesc *bufHdr = GetBufferDescriptor(i);
> @@ -2863,7 +2863,7 @@ void
> FlushRelationBuffers(Relation rel)
> {
> int i;
> - volatile BufferDesc *bufHdr;
> + BufferDesc *bufHdr;
>
> /* Open rel at the smgr level if not already done */
> RelationOpenSmgr(rel);
> @@ -2955,7 +2955,7 @@ void
> FlushDatabaseBuffers(Oid dbid)
> {
> int i;
> - volatile BufferDesc *bufHdr;
> + BufferDesc *bufHdr;
These are all correct based on the premise that some heavy lock -
implying a barrier - prevents new blocks with relevant tags from being
loaded into s_b. And it's fine if some other backend concurrently writes
out the buffer before we do - we'll notice that in the re-check after
taking the buffer header spinlock.

So, looks good to me.
Andres