From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavel Golub <pavel(at)microolap(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net> |
Subject: | Re: raw output from copy |
Date: | 2016-03-31 06:12:24 |
Message-ID: | CAFj8pRCUa8QMKmqbfVmsE5zDyH9rdSVL0i=Hku9+nRUVqsYK8w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi
2016-03-29 20:59 GMT+02:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > I am writing few lines as summary:
>
> > 1. invention RAW_TEXT and RAW_BINARY
> > 2. for RAW_BINARY: PQbinaryTuples() returns 1 and PQfformat() returns 1
> > 3.a for RAW_TEXT: PQbinaryTuples() returns 0 and PQfformat() returns 0,
> but
> > the client should to check PQcopyFormat() to not print "\n" on the end
> > 3.b for RAW_TEXT: PQbinaryTuples() returns 1 and PQfformat() returns 1,
> but
> > used output function, not necessary client modification
> > 4. PQcopyFormat() returns 0 for text, 1 for binary, 2 for RAW_TEXT, 3 for
> > RAW_BINARY
> > 5. create tests for ecpg
>
> 3.b certainly seems completely wrong. PQfformat==1 would imply binary
> data.
>
> I suggest that PQcopyFormat should be understood as defining the format
> of the copy data encapsulation, not the individual fields. So it would go
> like 0 = traditional text format, 1 = traditional binary format, 2 = raw
> (no encapsulation). You'd need to also look at PQfformat to distinguish
> raw text from raw binary. But if we do it as you suggest above, we've
> locked ourselves into only ever having two field format codes, which
> is something the existing design is specifically intended to allow
> expansion in.
>
>
I wrote concept of raw_text, raw_binary modes.
I am trying to implement text data passing like text format - but for
RAW_TEXT it is not practical. Text passing is designed for one line data,
for multiline data enforces escaping, what we don't would for RAW mode. I
have to skip escaping, and the code is not nice.
So I propose different schema - RAW_TEXT uses text values (uses
input/output functions), enforce encoding from/to client codes and for
passing to client mode is used binary mode - then I don't need to read the
content with line by line. PQbinaryTuples() returns 1 for RAW_TEXT and
RAW_BINARY - in these cases data are passed as one binary value. PQfformat
returns 2 for RAW_TEXT and 3 for RAW_BINARY.
Any objections to this design?
Regards
Pavel
> regards, tom lane
>
Attachment | Content-Type | Size |
---|---|---|
copy-raw-format-20160331-04.patch | text/x-patch | 41.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Craig Ringer | 2016-03-31 06:34:51 | Re: raw output from copy |
Previous Message | Rajkumar Raghuwanshi | 2016-03-31 05:35:37 | Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result |