From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Injection point locking |
Date: | 2024-07-09 05:16:13 |
Message-ID: | ZozHnQ5QZSjI4CPH@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 08, 2024 at 04:21:37PM +0300, Heikki Linnakangas wrote:
> I came up with the attached. It replaces the shmem hash table with an array
> that's scanned linearly. On each slot in the array, there's a generation
> number that indicates whether the slot is in use, and allows detecting
> concurrent modifications without locks. The attach/detach operations still
> hold the LWLock, but InjectionPointRun() is now lock-free, so it can be used
> without a PGPROC entry.
Okay, noted.
> It's now usable from postmaster too. However, it's theoretically possible
> that if shared memory is overwritten with garbage, the garbage looks like a
> valid injection point with a name that matches one of the injection points
> that postmaster looks at. That seems unlikely enough that I think we can
> accept the risk. To close that gap 100% I think a GUC is the only solution.
This does not worry me much, FWIW.
+ * optimization to.avoid scanning through the whole entry, in the common case
s/to.avoid/to avoid/
+ * generation counter on each entry to to allow safe, lock-free reading.
s/to to/to/
+ * we're looking for is concurrently added or remoed, we might or might
s/remoed/removed/
+ if (max_inuse == 0)
+ {
+ if (InjectionPointCache)
+ {
+ hash_destroy(InjectionPointCache);
+ InjectionPointCache = NULL;
+ }
+ return false;
In InjectionPointCacheRefresh(), this points to nothing in the cache,
so it should return NULL not false, even both are 0.
typedef struct InjectionPointCacheEntry
{
char name[INJ_NAME_MAXLEN];
+ int slot_idx;
+ uint64 generation;
char private_data[INJ_PRIVATE_MAXLEN];
InjectionPointCallback callback;
} InjectionPointCacheEntry;
May be worth mentioning that generation is a copy of
InjectionPointEntry's generation cross-checked at runtime with the
shmem entry to see if we have a cache consistent with shmem under the
same point name.
+ generation = pg_atomic_read_u64(&entry->generation);
+ if (generation % 2 == 0)
+ continue;
In the loops of InjectionPointCacheRefresh() and
InjectionPointDetach(), perhaps this should say that the slot is not
used hence skipped when generation is even.
InjectionPointDetach() has this code block at its end:
if (!found)
return false;
return true;
Not the fault of this patch, but this can just return "found".
The tricks with max_inuse to make the shmem lookups cheaper are
interesting.
+ pg_read_barrier();
+ if (memcmp(entry->name, name, namelen + 1) != 0)
+ continue;
Why this barrier when checking the name of a shmem entry before
reloading it in the local cache? Perhaps the reason should be
commented?
+ pg_read_barrier();
+ if (pg_atomic_read_u64(&entry->generation) != generation)
+ continue; /* was detached concurrently */
+
+ return injection_point_cache_load(&local_copy, idx, generation);
So, in InjectionPointCacheRefresh(), when a point is loaded into the
local cache for the first time, the read of "generation" is the
tipping point: it is possible to take a breakpoint at the beginning of
injection_point_cache_load(), detach then attach the point. What
matters is that we are going to use the data in local_copy, even if
shmem may have something entirely different. Hmm. Okay. It is a bit
annoying that the entry is just discarded and ignored if the local
copy and shmem generations don't match? Could it be more
user-friendly to go back to the beginning of ActiveInjectionPoints and
re-check the whole rather than return a NULL callback?
- if (private_data != NULL)
- memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
+ memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
private_data could be NULL, hence why the memcpy()?
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiro.Ikeda | 2024-07-09 05:19:49 | RE: Is it expected behavior index only scan shows "OPERATOR(pg_catalog." for EXPLAIN? |
Previous Message | Amit Kapila | 2024-07-09 05:05:37 | Re: Slow catchup of 2PC (twophase) transactions on replica in LR |