Re: COPY FROM STDIN behaviour on end-of-file

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY FROM STDIN behaviour on end-of-file
Date: 2017-05-17 05:34:54
Message-ID: CAEepm=3OT7fxbpvtJAX0TkK=+01Vbm8tCsTff4=3i0q0e-ERFw@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 17, 2017 at 2:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com> writes:
>>> Tom Lane wrote:
>>>> BTW, it would be a good idea for somebody to check this out on Windows,
>>>> assuming there's a way to generate a keyboard EOF signal there.
>
>> Ctrl-Z + Enter in windows generates EOF signal. I verified this issue and
>> it is not reproducible in windows.
>
> Thanks for checking. So that's two major platforms where it works "as
> expected" already.

Ah... the reason this is happening is that BSD-derived fread()
implementations return immediately if the EOF flag is set[1], but
others do not. At a guess, not doing that is probably more conformant
with POSIX ("... less than nitems only if a read error or end-of-file
is *encountered*", which seems to refer to the underlying condition
and not the user-clearable EOF flag).

We are neither clearing nor checking the EOF flag explicitly, and that
only works out OK on fread() implementations that also ignore it.
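
To illustrate the general idea (this is a minimal sketch, not the attached patch), a caller that wants the same behaviour on every platform can clear the EOF indicator itself after a short read. The helper name read_chunk_clearing_eof and its hit_eof out-parameter are made up for this example; only fread(), feof(), ferror() and clearerr() are standard C.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Read up to bufsize bytes from stream.  If the read came up short because
 * end-of-file was reached (and not because of a read error), report that
 * through *hit_eof and clear the stream's EOF indicator, so that a later
 * fread() on the same stream really tries the underlying source again,
 * even on implementations that return immediately while the flag is set.
 */
size_t
read_chunk_clearing_eof(FILE *stream, char *buf, size_t bufsize, int *hit_eof)
{
    size_t      nread;

    assert(stream != NULL && buf != NULL && hit_eof != NULL);

    nread = fread(buf, 1, bufsize, stream);
    *hit_eof = 0;

    if (nread < bufsize && feof(stream) && !ferror(stream))
    {
        *hit_eof = 1;
        /* clearerr() resets both the EOF and the error indicator. */
        clearerr(stream);
    }

    return nread;
}
```

With something like this, one keyboard EOF ends only the read that observed it, and the next read from the same terminal starts with a clean flag regardless of which fread() behaviour the platform has.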

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote (further upstream):
> If we're going
> to go out of our way to make it work, should we mention it in psql-ref?

I went looking for the place to put that and found that it already says:

For <literal>\copy ... from stdin</>, data rows are read from the same
source that issued the command, continuing until <literal>\.</literal>
is read or the stream reaches <acronym>EOF</>.

That's probably referring to the "outer" stream, such as a file that
contains the COPY ... FROM STDIN command, but doesn't it seem like a
general enough statement to cover ^d in the interactive case too?

Here's a version incorporating your other suggestions and a comment to explain.

[1] https://github.com/freebsd/freebsd/blob/afbef1895e627cd1993428a252d39b505cf6c085/lib/libc/stdio/refill.c#L79

--
Thomas Munro
http://www.enterprisedb.com

Attachment: clear-copy-stream-eof-v2.patch (application/octet-stream, 1.1 KB)
