Re: Show various offset arrays for heap WAL records

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: Show various offset arrays for heap WAL records
Date: 2023-04-10 22:03:56
Message-ID: CAAKRu_ZvGUSuy-RTzRg7VVC_oiTyS=LduHSaKr1h+6Pn82KGAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 9, 2023 at 8:12 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Fri, Apr 7, 2023 at 4:46 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > Pushed that one too.
>
> I noticed that the nbtree VACUUM and DELETE record types have their
> update/xl_btree_update arrays output incorrectly. We cannot use the
> generic array_desc() approach with xl_btree_update elements, because
> they're variable-width elements. The problem is that array_desc() only deals
> with fixed-width elements.

You are right. I'm sorry for the rather egregious oversight.

I took a look at the first patch even though you've pushed the bugfix
part. Any reason you didn't use array_desc() for the inner array (of
"ptids")? I find that following the pattern of using array_desc (when it
is correct, of course!) helps me to quickly identify: "okay, this is an
array of x" without having to stare at the loop too much.

I will say that the prefix of p in "ptid" makes it sound like pointer to
a tid, which I don't believe is what you meant.

> I also changed some of the details around whitespace in arrays in the
> fixup patch (though I didn't do the same with objects). It doesn't
> seem useful to use so much whitespace for long arrays of integers
> (really page offset numbers). And I brought a few nbtree desc routines
> that still used ";" characters as punctuation in line with the new
> convention.

Cool.

> Finally, the patch revises the guidelines written for rmgr desc
> routine authors. I don't think that we need to describe how to handle
> outputting whitespace in detail. It'll be quite natural for other
> rmgrs to use existing facilities such as array_desc() themselves,
> which makes whitespace type inconsistencies unlikely. I've tried to
> make the limits of the guidelines clear. The main goal is to avoid
> gratuitous inconsistencies, and to provide a standard way of doing
> things that many different rmgrs are likely to want to do, again and
> again. But individual rmgrs still have a certain amount of discretion,
> which seems like a good thing to me (the alternative requires that we
> fix at least a couple of things in nbtdesc.c and in heapdesc.c, which
> doesn't seem useful to me).

I like the new guidelines you proposed (in the patch).
They are well-written and clear.

On Mon, Apr 10, 2023 at 3:18 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, Apr 9, 2023 at 5:12 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I noticed that the nbtree VACUUM and DELETE record types have their
> > update/xl_btree_update arrays output incorrectly. We cannot use the
> > generic array_desc() approach with xl_btree_update elements, because
> > they're variable-width elements. The problem is that array_desc() only deals
> > with fixed-width elements.
>
> I pushed this fix just now, though without the updates to the
> guidelines (or only minimal updates).
>
> A remaining problem with arrays appears in "infobits" fields for
> record types such as LOCK. Here's an example of the problem:
>
> off: 34, xid: 3, flags: 0x00, infobits: [, LOCK_ONLY, EXCL_LOCK ]
>
> Clearly the punctuation from the array is malformed.

So, I did do this on purpose -- because I didn't want to have to do the
gymnastics to determine which flag was hit first (though it looks like I
mistakenly omitted the comma prepending IS_MULTI -- that was not
intentional).
I recognized that the output doesn't look nice, but I hadn't exactly
thought of it as malformed. Perhaps you are right.

I will say and I am still not a fan of the "if (first) else" logic in
your attached patch.

I've put my suggestion for how to do it instead inline with the code
diff below for clarity.

diff --git a/src/backend/access/rmgrdesc/heapdesc.c
b/src/backend/access/rmgrdesc/heapdesc.c
index 3bd083875..a64d14c2c 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -18,29 +18,75 @@
#include "access/rmgrdesc_utils.h"

static void
-out_infobits(StringInfo buf, uint8 infobits)
+infobits_desc(StringInfo buf, uint8 infobits, const char *keyname)
...
if (infobits & XLHL_KEYS_UPDATED)
- appendStringInfoString(buf, ", KEYS_UPDATED");
+ {
+ if (first)
+ appendStringInfoString(buf, "KEYS_UPDATED");
+ else
+ appendStringInfoString(buf, ", KEYS_UPDATED");
+ first = false;
+ }

How about we have the flags use a trailing comma and space and then
overwrite the last one with something this:

if (infobits & XLHL_KEYS_UPDATED)
appendStringInfoString(buf, "KEYS_UPDATED, ");
buf->data[buf->len -= strlen(", ")] = '\0';

@@ -230,7 +271,9 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
OffsetNumber *offsets;

I don't prefer this to what I had, which is also correct, right?

plans = (xl_heap_freeze_plan *)
XLogRecGetBlockData(record, 0, NULL);
- offsets = (OffsetNumber *) &plans[xlrec->nplans];
+ offsets = (OffsetNumber *) ((char *) plans +
+ (xlrec->nplans *
+ sizeof(xl_heap_freeze_plan)));
appendStringInfoString(buf, ", plans:");
array_desc(buf, plans, sizeof(xl_heap_freeze_plan), xlrec->nplans,
&plan_elem_desc, &offsets);

> A second issue (related to the first) is the name of the key itself,
> "infobits". While "infobits" actually seems fine in this particular
> example, I don't think that we want to do the same for record types
> such as HEAP_UPDATE, since such records require that the description
> show information about flags whose underlying field in the WAL record
> struct is actually called "old_infobits_set". I think that we should
> be outputting "old_infobits: [ ... ] " in the description of
> HEAP_UPDATE records, which isn't the case right now.

--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -18,29 +18,75 @@
#include "access/rmgrdesc_utils.h"

static void
-out_infobits(StringInfo buf, uint8 infobits)
+infobits_desc(StringInfo buf, uint8 infobits, const char *keyname)

I like the keyname parameter.

> Note that the patch makes many individual (say) HOT_UPDATE records
> have descriptions that look like this:
>
> ... old_infobits: [], ...
>
> This differs from HEAD, where the output is totally suppressed because
> there are no flag bits to show. I think that this behavior is more
> logical and consistent overall.

Yea, I think it is better to include things and show that they are empty
then omit them. I find it more clear.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-04-10 22:27:38 Re: When to drop src/tools/msvc support
Previous Message Tom Lane 2023-04-10 20:50:20 Re: When to drop src/tools/msvc support