Re: COPY IN/BOTH vs. extended query mode

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY IN/BOTH vs. extended query mode
Date: 2017-02-13 17:14:20
Message-ID: CA+TgmoahKj-7WP2Ax3pHqSfwdVJn_EKvH5=+bsUtS7AOFC43vg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 23, 2017 at 9:12 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> According to the documentation for COPY IN mode, "If the COPY command
> was issued via an extended-query message, the backend will now discard
> frontend messages until a Sync message is received, then it will issue
> ReadyForQuery and return to normal processing." I added a similar
> note to the documentation for COPY BOTH mode in
> 91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
> accurately describes the behavior of the server. However, this seems
> to make fully correct error handling for clients using libpq almost
> impossible, because PQsendQueryGuts() sends
> Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
> the command that was just sent invoked COPY mode (cf. the note in
> CopyGetData about why we ignore Flush and Sync in that function).
>
> So imagine that the client uses libpq to send (via the extended query
> protocol) a COPY IN command (or some hypothetical command that starts
> COPY BOTH mode). If the server throws an error before the
> Sync message is consumed, it will bounce back to PostgresMain which
> will set doing_extended_query_message = true after which it will
> consume messages, find the Sync, reset that flag, and send
> ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
> consumes the Sync message in CopyGetData (or a similar function), and
> *then* throws an ERROR, the server will wait for a second Sync message
> from the client before issuing ReadyForQuery. There is no sensible
> way of coping with this problem in libpq, because there is no way for
> the client to know which part of the server code consumed the Sync
> message that it already sent. In short, from the client's point of
> view, if it enters COPY IN or COPY BOTH mode via the extended query
> protocol, and an error occurs on the server, the server MAY OR MAY NOT
> expect a further Sync message before issuing ReadyForQuery, and the
> client has no way of knowing -- except maybe waiting for a while to
> see what happens.
>
> It does not appear to me that there is any good solution to this
> problem. Fixing it on the server side would require a wire protocol
> change - e.g. one kind of Sync message that is used in a
> Parse-Bind-Describe-Execute-Sync sequence that only terminates
> non-COPY commands and another kind that is used to signal the end even
> of COPY. Fixing it on the client side would require all clients to
> know prior to initiating an extended-query-protocol sequence whether
> or not the command was going to initiate COPY, which is an awful API
> even if it didn't constitute an impossible-to-contemplate backward
> compatibility break. Perhaps we will have to be content to document
> the fact that this part of the protocol is depressingly broken...
>
> ...unless of course somebody can see something that I'm missing here
> and the situation isn't as bad as it currently appears to me to be.

Anybody have any thoughts on this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
