Re: Libpq: copy file to bytea column

From: seiliki(at)so-net(dot)net(dot)tw
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Libpq: copy file to bytea column
Date: 2010-03-08 12:52:30
Message-ID: 20100308125237.2E852F48170@m5.so-net.net.tw
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> > The data types of tableout.c1 and tablein.c1 are both bytea.
> > I first export tableout.c1 to a file:
> >
> > db1=# COPY (SELECT c1 FROM tableout LIMIT 1) TO '/tmp/t';
> >
> > Then I try to import the file to another table.
> >
> > This works without flaw:
> >
> > db1=# COPY tablein FROM '/tmp/t';
> >
> > However, I get the following errors from log when calling
> > libpq functions PQputCopyData() and PQputCopyEnd().
> >
> > 2010-03-06 20:47:42 CST ERROR: invalid byte sequence for
> > encoding "UTF8": 0x00
> > 2010-03-06 20:47:42 CST HINT: This error can also happen if
> > the byte sequence does not match the encoding expected by the
> > server, which is controlled by "client_encoding".
> > 2010-03-06 20:47:42 CST CONTEXT: COPY in, line 1:
> > "TPF0\011Report\000\003Tag\002\365\016ExportFromPage\002\000\014ExportToPage\002\000\006Values\016..."
> > 2010-03-06 20:47:42 CST STATEMENT: COPY in FROM STDIN
>
> The difference here is that your first examples are COPY TO/FROM
> files on the database server, while your failing C code uses
> COPY FROP STDIN.
>
> You will not meet encoding problems with the first because
> the file /tmp/t is created in the server encoding, so if you read
> it later with a COPY FROM, it will automatically be in the encoding
> that COPY FROM <file> expects.
>
> When you use COPY FROM STDIN, you send data from the client to the
> server, which is something quite different. Data sent from the client
> are expected to be in the client encoding, not in the server encoding.
>
> You can use the SQL statement "SET client_encoding = ..." to set the
> client encoding to the encoding of your data, or you can use the
> PGCLIENTENCODING environment variable to set it automatically.
>
> The error message you are getting indicates that you client_encoding
> is currently UTF8.
> By the way, the line quoted in your error message does not look like
> valid text at all - it contains "zero" bytes, and preceeding each
> string there seems to be a byte indicating the length of the string
> that follows.

Changing client encoding can not make PQputCopyData() work, either. The problem seems to be caused by the fact that PQputCopyData() does not accept zero value ('\0' in C), which I think can not be encoded to any character regardless of which client encoding is used.

Yes, the data I am trying to import is binary data. That is why I use bytea to store it.

What I can't understand is why PQputCopyData() encodes incoming character string, which was exported by "COPY ... TO '/tmp/t'" SQL command under psql prompt, for bytea columns and refuse to accept or misinterpret zero value.

Below is the more or less self contained code that also yields the same problem.
----------------------------
#define BYTEA "abcd\\011\\000ABCDEFGHIJKLMNO"

int main()
{
PGconn *PgConnect;
PGresult *result;
PgConnect=PQconndb("host=localhost port=5432 dbname=db1 user=postgres password=passwd");
if(!PgConnect)
puts("Fail to connect.");
else{
result=PQexec(PgConnect,"COPY tablein FROM STDIN");
if(IsPgsqlExecutionOk(NULL,result)){
if(PQputCopyData(PgConnect,BYTEA,20) == 1){
if(PQputCopyEnd(PgConnect,NULL) == 1)
puts("Done");
else
puts(PQerrorMessage(PgConnect));
}
else
puts(PQerrorMessage(PgConnect));
}
PQfinish(PgConnect);
}
}
------Errors from the code:---------------
ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: COPY tablein, line 1: "abcd\011\000ABCDEFGH"
STATEMENT: COPY tablein FROM STDIN
----------------------------

Another issue I can not understand is that both PQputCopyData() and PQputCopyEnd() always return 1 even though the copy operation actually fails. That being said, the above code always shows "Done" but actually error occurs.

Regards,
CN

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message seiliki 2010-03-08 13:51:52 Re: Libpq: copy file to bytea column
Previous Message Chris Roffler 2010-03-08 12:50:11 Re: XML Index again