Re: type cache cleanup improvements

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: type cache cleanup improvements
Date: 2024-08-21 13:28:13
Message-ID: CALT9ZEFHxez3Y61nsV4tNK+=HdBqj4OeZQk0UOwecGWqFWnjbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Alexander!

On Tue, 20 Aug 2024 at 23:01, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:

> On Mon, Aug 5, 2024 at 4:16 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com>
> wrote:
> > I've revised the patchset. First of all, I've re-ordered the patches.
> >
> > 0001-0002 (former 0002-0003)
> > Comprises hash_search_with_hash_value() function and its application
> > to avoid full hash iteration in InvalidateAttoptCacheCallback() and
> > TypeCacheTypCallback(). I think this is quite straightforward
> > optimization without negative side effects. I've revised comments,
> > commit message and did some code beautification. I'm going to push
> > this if no objections.
> >
> > 0003 (former 0001)
> > I've revised this patch. I think main concerns expressed in the
> > thread about this path is that we don't have invalidation mechanism
> > for relid => typid map. Finally due to oid wraparound same relids
> > could get reused. That could lead to invalid entries in the map about
> > existing relids and typeids. This is rather messy, but I don't think
> > this could cause a material bug. The maps items are used only for
> > cache invalidation. Extra invalidation doesn't cause a bug. If type
> > with same relid will be cached, then correspoding map item will be
> > overridden, so no missing invalidation. However, I see the following
> > reasons for keeping consistent state of relid => typid map.
> >
> > 1) As the main use-case for this optimization is flood of temporary
> > tables, it would be nice not let relid => typid map bloat in this
> > case. I see that TypeCacheHash would get bloated, because its entries
> > are never deleted. However, I would prefer to not get this situation
> > even worse.
> > 2) In future we may find some more use-cases for relid => typid map
> > besides cache invalidation. Keeping that in consistent state could be
> > advantage then.
> >
> > In the attached patch, I'm keeping relid => typid map when
> > corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or
> > TCFLAGS_OPERATOR_FLAGS, or tupdesc. Thus, when temporary table gets
> > deleted, we would invalidate the map item.
> >
> > It will be also nice to get rid of iteration over all the cached
> > domain types in TypeCacheRelCallback(). However, this typically
> > shouldn't be a problem since domain types are less tended to bloat.
> > Domain types are created manually, unlike composite types which are
> > automatically created for every temporary table. We will probably
> > need to optimize this in future, but I don't feel this to be necessary
> > in present patch.
> >
> > I think the revised 0003 requires review.
>
> The rebased remaining patch is attached.
>
I've looked at patch v8.

1.
In function check_insert_rel_type_cache() the block:

+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for
RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif

As I understand it does HASH_FIND after the same value just inserted by
HASH_ENT
ER above under the same if condition:

if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)

Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in
comment in a quoted block, but if condition is the same.

2.
For function check_delete_rel_type_cache():
I'd modify the block:
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for
RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif

as:
+
+ /*
+ * In assert-enabled builds otherwise check for
RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ else
+{
+ #ifdef USE_ASSERT_CHECKING
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+#endif
+}

3. I think check_delete_rel_type_cache and check_insert_rel_type_cache are
better to be renamed to be more clear, though I don't have exact proposals
yet,
4. I haven't looked into comments, though I'd recommend oid -> OID
replacement in the comments.

Thank you for working on this patchset!

Regards,
Pavel Borisov
Supabase

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-08-21 13:34:19 Re: Cirrus CI for macOS branches 16 and 15 broken
Previous Message Peter Eisentraut 2024-08-21 13:19:58 Re: Requiring LLVM 14+ in PostgreSQL 18