From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used |
Date: | 2023-10-13 15:00:00 |
Message-ID: | 5df89a35-a985-b287-7210-6ed7d2c31803@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hello Tom,
13.10.2023 00:01, Tom Lane wrote:
> I came back to this question today, and after more thought I'm going
> to cast my vote for "let's not trust SearchSysCacheExists()+lookup".
> It just seems like that's too fragile. The case reported in this
> thread of it failing in parallel workers but not main should scare
> anybody, and the future course of development seems far more likely
> to introduce new problems than remove them.
Thanks for digging into this!
> I spent some time looking through existing SearchSysCacheExists calls,
> and I could only find two sets of routines where we seem to be
> depending on SearchSysCacheExists to protect a subsequent lookup
> somewhere else, and there isn't any lock on the object in question.
> Those are the has_foo_privilege functions discussed here, and the
> foo_is_visible functions near the bottom of namespace.c. I'm not
> sure why we've not heard complaints traceable to the foo_is_visible
> family. Maybe nobody has tried hard to break them, or maybe they
> are just less likely to be used in ways that are at risk.
I'll try to research/break xxx_is_visible and share my findings tomorrow.
> It turns out to not be that hard to get rid of the problem in the
> has_foo_privilege family, as per attached patches. I've not looked
> at fixing the foo_is_visible family, but that probably ought to be a
> separate patch anyway.
Yeah, the attached patches greatly improve consistency. The only
inconsistency I've found in the patches is a missing comma here:
+ * Exported generic routine for checking a user's access privileges to an object
+ * with is_missing
You removed "this case is not a user-facing error, so elog not ereport",
and I think it's good for consistency too, as all aclmask/aclcheck
functions now use ereport().
> BTW, while nosing around I found what seems like a very nasty related
> bug. Suppose that a catalog tuple being loaded into syscache contains
> some toasted fields. CatalogCacheCreateEntry will flatten the tuple,
> involving fetches from toast tables that will certainly cause
> AcceptInvalidationMessages calls. What if one of those should have
> invalidated this tuple? We will not notice, because it's not in
> the hashtable yet. When we do add it, we will mark it not-dead,
> meaning that the stale entry looks fine and could persist for a long
> while.
Yeah, it's an interesting case that needs investigation, IMO.
I'll try to look into this and construct the test case in the background.
> 0001's changes in acl.c are straightforward, but it's worth noting
> that the has_sequence_privilege functions hadn't gotten the memo
> about not failing when a bogus relation OID is passed.
I've looked through all functions has_*priv*_id and all they have
"if (is_missing) PG_RETURN_NULL()" now (with the patches applied).
Best regards,
Alexander
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2023-10-14 02:34:43 | Re: BUG #18130: \copy fails with "could not read block" or "page should be empty but not" errors due to triggers |
Previous Message | PG Bug reporting form | 2023-10-13 14:26:56 | BUG #18156: Self-referential foreign key in partitioned table not enforced on deletes |