Re: Enumize logical replication message actions

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: ashutosh(dot)bapat(at)2ndquadrant(dot)com
Cc: andres(at)anarazel(dot)de, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, amit(dot)kapila16(at)gmail(dot)com
Subject: Re: Enumize logical replication message actions
Date: 2020-10-23 01:08:44
Message-ID: 20201023.100844.960851079822542149.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 22 Oct 2020 16:37:18 +0530, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com> wrote in
> On Thu, 22 Oct 2020 at 14:46, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
> wrote:
>
> >
> >
> > We shouldn't have the default: in the switch() block in
> > apply_dispatch(). That prevents compilers from checking
> > completeness. The content of the default: should be moved out to after
> > the switch() block.
> >
> > apply_dispatch()
> > {
> > switch (action)
> > {
> > ....
> > case LOGICAL_REP_MSG_STREAM_COMMIT(s);
> > apply_handle_stream_commit(s);
> > return;
> > }
> >
> > ereport(ERROR, ...);
> > }
> >
> > > 0002 adds wrappers on top of pq_sendbyte() and pq_getmsgbyte() to send
> > and
> > > receive a logical replication message type respectively. These wrappers
> > add
> > > more protection to make sure that the enum definitions fit one byte. This
> > > also removes the default case from apply_dispatch() so that we can detect
> > > any LogicalRepMsgType not handled by that function.
> >
> > pg_send_logicalrep_msg_type() looks somewhat too-much. If we need
> > something like that we shouldn't do this refactoring, I think.
> >
>
> Enum is an integer, and we want to send byte. The function asserts that the
> enum fits a byte. If there's a way to declare byte long enums I would use
> that. But I didn't find a way to do that.

That way of defining enums can contain two different symbols with the
same value. If we need to check the values are actually in the range
of char, checking duplicate values has more importance from the
standpoint of likelihood.

AFAICS there're two instances of this kind of enums, CoreceionMethod
and TypeCat. None of them are not checked for width nor duplicates
when they are used.

Even if we need such a kind of check, it souldn't be a wrapper
function that adds costs on non-assertion builds, but a replacing of
pq_sendbyte() done only on USE_ASSERT_CHECKING.

> pg_get_logicalrep_msg_type() seems doing the same check (that the
> > value is compared aganst every keyword value) with
> > apply_dispatch(). Why do we need that function separately from
> > apply_dispatch()?
> >
> >
> The second patch removes the default case you quoted above. I think that's
> important to detect any unhandled case at compile time rather than at run
> time. But we need some way to detect whether the values we get from wire
> are legit. pg_get_logicalrep_msg_type() does that. Further that function
> can be used at places other than apply_dispatch() if required without each
> of those places having their own validation.

Even if that enum contains out-of-range values, that "command" is sent
having truncated to uint8 and on the receiver side apply_dispatch()
doesn't identify the command and raises an error. That is equivalent
to what pq_send_logicalrep_msg_type() does. (Also equivalent on the
point that symbols that are not used in regression are not checked.)

reagrds.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-10-23 01:20:11 Re: Enumize logical replication message actions
Previous Message Peter Geoghegan 2020-10-23 01:05:28 heapam and bottom-up garbage collection, keeping version chains short (Was: Deleting older versions in unique indexes to avoid page splits)