problems with copy from file

From: Andreas Kretschmer <akretschmer(at)spamfence(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: problems with copy from file
Date: 2015-10-14 16:40:30
Message-ID: 20151014164030.GA15223@tux
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

hello @ll,

i have a simpe table :

test=# create table httpd_log(id serial primary key, data text);
CREATE TABLE

and i'm trying to import from a httpd-log, that contains this:

kretschmer(at)tux:~$ cat test.log
...
domain.de aaa.63.xx.yy - - [06/Nov/2014:00:48:22 +0100] "GET /index.php/impressum2/year.listevents/2015/01/08/101 HTTP/1.0" 200 28076 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)" "-"
other-domain.de bb.243.xx.yyy - - [06/Nov/2014:00:48:22 +0100] "\x16\x03\x01\x01\xb1\x01" 501 1037 "-" "-" "-"
domain.tld cc.249.xx.yy - - [06/Nov/2014:00:48:22 +0100] "GET /leipzig/transport?start=Uranusstr.&ziel=Finkengrund HTTP/1.1" 200 5610 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
...

Url's and IP's changed by me. As you can see: simple ASCII-Text

test=*# \copy httpd_log (data) from '~/test.log';
ERROR: invalid byte sequence for encoding "UTF8": 0xb1
CONTEXT: COPY httpd_log, line 3: "other-domain bb.243.xx.yyy - - [06/Nov/2014:00:48:22 +0100] "\x16\x03\x01\x01\xb1\x01" 501 10..."

I have a solution:

test=*# \copy apache_log (data) from program 'sed -e "s/\\x/\\\\x/g" test.log';
COPY 5

The table contains the corrent content:

...
other-domain.de bb.243.xx.yyy - - [06/Nov/2014:00:48:22 +0100] "\x16\x03\x01\x01\xb1\x01" 501 1037 "-" "-" "-"
...

But is there a way to COPY the file without external tools, in this case sed?

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jonathan Vanasco 2015-10-14 19:16:18 Re: using postgresql for session
Previous Message Merlin Moncure 2015-10-14 16:31:11 Re: Understanding "seq scans"