Re: Re: pglogical_output - a general purpose logical decoding output plugin

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tomasz Rybak <tomasz(dot)rybak(at)post(dot)pl>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: pglogical_output - a general purpose logical decoding output plugin
Date: 2016-01-20 05:54:10
Message-ID: CAMsr+YHdD=GKMSZPz6u-YOe4VtrkaCTJWr09tHHHUHoPQ9iGMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20 January 2016 at 06:23, Tomasz Rybak <tomasz(dot)rybak(at)post(dot)pl> wrote:

> The following review has been posted through the commitfest application:
>

Thanks!

>
> + /* Protocol capabilities */
> + #define PGLOGICAL_PROTO_VERSION_NUM 1
> + #define PGLOGICAL_PROTO_MIN_VERSION_NUM 1
> Is this protocol version number and minimal recognized version number,
> or major and minor version number? I assume that
> PGLOGICAL_PROTO_VERSION_NUM
> is current protocol version (as in config max_proto_version and
> min_proto_version). But why we have MIN_VERSION and not MAX_VERSION?
>
> From code in pglogical_output.c (lines 215-225 it looks like
> PGLOGICAL_PROTO_VERSION_NUM is maximum, and PGLOGICAL_PROTO_MIN_VERSION_NUM
> is treated as minimal protocol version number.
>
> I can see that those constants are exported in pglogical_infofuncs.c and
> pglogical.sql, but I do not understand omission of MAX.
>

Thanks for stopping to think about this. It's one of the areas I really
want to get right but I'm not totally confident of.

The idea here is that we want downwards compatibility as far as possible
and maintainable but we can't really be upwards compatible for breaking
protocol revisions. So the output plugin's native protocol version is
inherently the max protocol version and we don't need a separate MAX.

The downstream connects and declares to the upstream "I speak protocol 2
through 3". The upstream sees this and replies "I speak protocol 1 through
2. We have protocol 2 in common so I will use that." Or alternately replies
with an error "I only speak protocol 1 so we have no protocol in common".
This is done via the initial parameters passed by the downstream to the
logical decoding plugin and then via the startup reply message that's the
first message on the logical decoding stream.

We can't do a better handshake than this because the underlying walsender
protocol and output plugin API only gives us one chance to send free-form
information to the output plugin and it has to do so before the output
plugin can send anything to the downstream.

As much as possible I want to avoid ever needing to do a protocol bump
anyway, since it'll involve adding conditionals and duplication. That's why
the protocol tries so hard to be extensible and rely on declared
capabilities rather than protocol version bumps. But I'd rather plan for it
than be unable to ever do it in future.

Much (all?) of this is discussed in the protocol docs. I should probably
double check that and add a comment that refers to them there.

> + typedef struct HookFuncName
>

Thanks. That's residue from the prior implementation of hooks, which used
direct pg_proc lookups and cached the FmgrInfo in a dynahash. It's no
longer required now that we're using a single hook entry point and direct C
function calls. Dead code, removed.

> + typedef struct PGLogicalTupleData
> I haven't found those used anything, and they are not mentioned in
> documentation. Are those structures needed?
>

That snuck in from the pglogical downstream during the split into a
separate tree. It's dead code as far as pglogical_output is concerned.
Removed.

> + pglogical_config.c:
> + switch(get_param_key(elem->defname))
> + {
> + val = get_param_value(elem, false,
> OUTPUT_PARAM_TYPE_UINT32);

Why do we need this line here? All cases contain some variant of
> val = get_param_value(elem, false, TYPE);
> as first line after "case PARAM_*:" so this line should be removed.
>

Correct. That seems to be an escapee editing error. Thanks, removed.

> + val = get_param(options, "startup_params_format", false, false,
> + OUTPUT_PARAM_TYPE_UINT32, &found);
> +
> + params_format = DatumGetUInt32(val);
> Shouldn't we check "found" here? We work with val (which is Datum(0)) -
> won't
> it throw SIGFAULT, or similar?
>

get_param is called with missing_ok=false so it ERRORs and can never return
!found . In any case it'd return (Datum)0 so we'd just get (uint32)0 not a
crash.

To make this clearer I've changed get_param so it supports NULL as a value
for found.

> Additionally - I haven't found any case where code would use "found"
> passed from get_param() - so as it's not used it might be removed.
>

Probably, but I expect it to be useful later. It was used before a
restructure of how params get read. I don't mind removing it if you feel
strongly about it, but it'll probably just land up being added back at some
point.

>
>
+ elog(DEBUG1, "Binary mode rejected: Server and client
> endian sizeof(long) mismatch");
> Isn't "endian" here case of copy-paste from first error?
>

Yes, it is. Thanks.

> + static void pg_decode_shutdown(LogicalDecodingContext * ctx)
> In pg_decode_startup we create main memory context, and create hooks memory
> context. In pg_decode_shutdown we delete hooks memory context but not main
> context. Is this OK, or should we also add:
> MemoryContextDelete(data->context);
> <http://www.postgresql.org/mailpref/pgsql-hackers>
>

Good catch. I think a better fix is to make it a child of the logical
decoding context so it's deleted automatically. It's actually unnecessary
to delete data->hooks_mctxt here for the same reason.

Amended.

I've also patched the tests to handle the failure to fail on an
incorrect startup_params_format .

Changes pushed to
https://github.com/2ndquadrant/postgres/tree/dev/pglogical-output

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2016-01-20 06:05:01 Re: Support for N synchronous standby servers - take 2
Previous Message Pavel Stehule 2016-01-20 05:48:52 Re: Rethinking TRANSFORM FOR TYPE ...