Re: Revise the Asserts added to bimapset manipulation functions

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Revise the Asserts added to bimapset manipulation functions
Date: 2024-01-02 07:18:07
Message-ID: CAMbWs4_6-CtV-AWK=pQ=RCNpbB_kLqH8FhedvC-3pxc=c4nGXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Dec 31, 2023 at 6:44 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Fri, 29 Dec 2023 at 23:38, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > After applying this process to the first RestrictInfo, the bitmapset now
> > becomes {t2, t3}. Consequently, the second RestrictInfo also perceives
> > {t2, t3} as its required_relids. As a result, when attempting to remove
> > it from the joininfo lists, a problem arises, because it is not in t2's
> > joininfo list.
>
> Changing the relids so they reference t2 instead of t1 requires both
> bms_add_member() and bms_del_member(). The bms_add_member() will
> cause the set to be reallocated with my patch so I don't see why this
> case isn't covered.

Hmm, you're right. This particular case is covered by your patch. I
wondered if there might be another case where a bitmapset with more than
one reference is modified without being potentially required to be
reallocated. I'm not sure if there is such case in reality (or could be
introduced in the future), but if there is, I think it would be great if
REALLOCATE_BITMAPSETS could also help handle it.

> > Also, here are some review comments for the v2 patch:
> >
> > * There is no reallocation that happens in bms_add_members(), do we need
> > to call bms_copy_and_free() there?
>
> The spec I put in the comment at the top of bitmapset.c says:
>
> > have the code *always* reallocate the bitmapset when the
> > * set *could* be reallocated as a result of the modification
>
> Looking at bms_add_members(), it seems to me that the set *could* be
> reallocated as a result of the modification, per:
>
> if (a->nwords < b->nwords)
> {
> result = bms_copy(b);
> other = a;
> }
>
> In my view, that meets the spec I outlined.

I think one purpose of introducing REALLOCATE_BITMAPSETS is to help find
dangling pointers to Bitmapset's. From this point of view, I agree that
we should call bms_copy_and_free() in bms_add_members(), because the
bitmapset 'a' might be recycled (in-place modified, or pfreed).

According to this criterion, it seems to me that we should also call
bms_copy_and_free() in bms_join(), since both inputs there might be
recycled. But maybe I'm not understanding it correctly.

> > * Do you think we can add Assert(bms_is_valid_set(a)) for bms_free()?
>
> I did briefly have the Assert in bms_free(), but I removed it as I
> didn't think it was that useful to validate the set before freeing it.
> Certainly, there'd be an argument to do that, but I ended up not
> putting it there. I wondered if there could be a case where we do
> something like bms_int_members() which results in an empty set and
> after checking for and finding the set is now empty, we call
> bms_free(). If we did that then we'd Assert fail. In reality, we use
> pfree() instead of bms_free() as we already know the set is not NULL,
> so it wouldn't cause a problem for that particular case. I wondered if
> there might be another one that I didn't spot, so felt it was best not
> to Assert there.
>
> I imagine that if we found some case where the bms_free() Assert
> failed, we'd likely just remove it rather than trying to make the set
> valid before freeing it. So it seems pretty pointless if we'd opt to
> remove the Assert() at the first sign of trouble.

I think I understand your concern. In some bitmapset manipulation
functions, like bms_int_members(), the result might be computed as
empty. In such cases we need to free the input bitmap. If we use
bms_free(), the Assert would fail.

It seems to me that this scenario can only occur within the bitmapset
manipulation functions. Outside of bitmapset.c, I think it should be
quite safe to call bms_free() with this Assert. So I think it should
not have problem to have this Assert in bms_free() as long as we are
careful enough when calling bms_free() inside bitmapset.c.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-01-02 08:18:00 Re: Extract numeric filed in JSONB more effectively
Previous Message Tatsuo Ishii 2024-01-02 06:39:25 INFORMATION_SCHEMA node