Re: COPY IN/BOTH vs. extended query mode

From: Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY IN/BOTH vs. extended query mode
Date: 2017-02-13 21:29:48
Message-ID: CAMsr+YGvp2wRx9pPSxaKFdaObxX8DzWse+OkWk2xpXSvT0rq-g@mail.gmail.com
Lists: pgsql-hackers

On 14 Feb. 2017 06:15, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:

On Mon, Jan 23, 2017 at 9:12 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> According to the documentation for COPY IN mode, "If the COPY command
> was issued via an extended-query message, the backend will now discard
> frontend messages until a Sync message is received, then it will issue
> ReadyForQuery and return to normal processing." I added a similar
> note to the documentation for COPY BOTH mode in
> 91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
> accurately describes the behavior of the server. However, this seems
> to make fully correct error handling for clients using libpq almost
> impossible, because PQsendQueryGuts() sends
> Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
> the command that was just sent invoked COPY mode (cf. the note in
> CopyGetData about why we ignore Flush and Sync in that function).
>
> So imagine that the client uses libpq to send (via the extended query
> protocol) a COPY IN command (or some hypothetical command that starts
> COPY BOTH mode to begin). If the server throws an error before the
> Sync message is consumed, it will bounce back to PostgresMain which
> will set doing_extended_query_message = true after which it will
> consume messages, find the Sync, reset that flag, and send
> ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
> consumes the Sync message in CopyGetData (or a similar function), and
> *then* throws an ERROR, the server will wait for a second Sync message
> from the client before issuing ReadyForQuery. There is no sensible
> way of coping with this problem in libpq, because there is no way for
> the client to know which part of the server code consumed the Sync
> message that it already sent. In short, from the client's point of
> view, if it enters COPY IN or COPY BOTH mode via the extended query
> protocol, and an error occurs on the server, the server MAY OR MAY NOT
> expect a further Sync message before issuing ReadyForQuery, and the
> client has no way of knowing -- except maybe waiting for a while to
> see what happens.
>
> It does not appear to me that there is any good solution to this
> problem. Fixing it on the server side would require a wire protocol
> change - e.g. one kind of Sync message that is used in a
> Parse-Bind-Describe-Execute-Sync sequence that only terminates
> non-COPY commands and another kind that is used to signal the end even
> of COPY. Fixing it on the client side would require all clients to
> know prior to initiating an extended-query-protocol sequence whether
> or not the command was going to initiate COPY, which is an awful API
> even if it didn't constitute an impossible-to-contemplate backward
> compatibility break. Perhaps we will have to be content to document
> the fact that this part of the protocol is depressingly broken...
>
> ...unless of course somebody can see something that I'm missing here
> and the situation isn't as bad as it currently appears to me to be.

Anybody have any thoughts on this?

I've been thinking on it a bit, but don't really have anything that can be
done without a protocol version bump.

We can't really disallow extended query protocol COPY; too much is likely
to break. And we can't fix it without a protocol change.

A warning in the docs for COPY would be appropriate, noting that clients
should use the simple query protocol to issue COPY. It's kind of mixing
layers, since many users won't see the protocol level or have any idea
whether their client driver uses the extended or simple query protocol, but
we can at least advise libpq users.

The protocol docs should also note that clients sending COPY should prefer
the simple query protocol, due to error recovery issues with COPY under the
extended query protocol.
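
To make the ambiguity described in the quoted message concrete, here is a toy
model in Python. It is only a sketch of the behavior as described above, not
the server's actual code: the function name backend_responses, the
reads_before_error parameter, and the representation of messages as plain
strings are all assumptions made for illustration, and the replies to
Parse/Bind/Describe are left out.

```python
from collections import deque

# What the client sends in one shot, per the description of PQsendQueryGuts().
ONE_SHOT = ["Parse", "Bind", "Describe", "Execute", "Sync"]


def backend_responses(messages, reads_before_error):
    """Toy backend: Execute always starts COPY IN and then always errors.

    reads_before_error (hypothetical knob) is how many frontend messages the
    COPY data reader consumes before the error is thrown. A Sync consumed by
    the COPY reader is silently swallowed.
    """
    pending = deque(messages)
    responses = []
    while pending:
        msg = pending.popleft()
        if msg == "Sync":
            # A Sync seen in normal processing is answered with ReadyForQuery.
            responses.append("ReadyForQuery")
            continue
        if msg != "Execute":
            continue  # Parse/Bind/Describe replies omitted for brevity
        responses.append("CopyInResponse")
        # COPY mode: the reader consumes messages, including any Sync.
        for _ in range(min(reads_before_error, len(pending))):
            pending.popleft()
        responses.append("ErrorResponse")
        # Error recovery: discard frontend messages until a Sync arrives.
        while pending:
            if pending.popleft() == "Sync":
                responses.append("ReadyForQuery")
                break
    return responses


# Error thrown before the COPY reader has consumed the Sync.
print(backend_responses(ONE_SHOT, 0))
# Error thrown after the COPY reader has swallowed the Sync.
print(backend_responses(ONE_SHOT, 1))
# Blindly sending a second Sync "just in case", for both timings.
print(backend_responses(ONE_SHOT + ["Sync"], 0))
print(backend_responses(ONE_SHOT + ["Sync"], 1))
```

In this model both timings start with CopyInResponse followed by
ErrorResponse, so the client cannot tell them apart at that point. With the
early error a ReadyForQuery follows; with the late error nothing follows
until another Sync is sent. Sending a second Sync unconditionally does not
help either, since in the early-error case it produces an extra
ReadyForQuery that the client then has to account for.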
