Re: copy command

From: DHSC Webmaster <webmaster(at)dhs-club(dot)com>
To: "Zhang, Anna" <azhang(at)verisign(dot)com>
Cc: "'pgsql-admin(at)postgresql(dot)org'" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: copy command
Date: 2002-01-03 19:11:33
Message-ID: 3C34ACE5.3D201E9@dhs-club.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Isabelle Brette wrote:
>
> Hello,
>
> On Thu, Jan 03, 2002 at 12:10:01PM -0500, Zhang, Anna wrote:
> > I have a problem to load data to postgres database using copy command. The
> > problem is that we have one column called address which is multi-line text,
> > the taxt file looks like this:
> >
> > aab770|awkc.com administration|sultan 23
> > Bogota, na00000|CO
> >
> > above shows one record with '|' as delimiters. Column 3 has two lines which
> > caused the record is splited into two lines. If I use copy command to load
> > this record, it is always treated as two records. Is there any way to tell
> > postgres to read until reach a specified char? not stop at '\n'?
> > Or maybe edit the text file to make it one line, but how? We have milions
> > of such records in the text file.
>
> You can use some powerful editor, such as VI, to recognize any line that
> does not begin with a proper pattern, and join it with the one before.
> Not to lose the line feed you'll probably have to add something to
> replace it (such as, let's say, a tab) and convert it back once you've
> imported (postgres has some nice pattern-matching features).
>
> I've done this before to prepare an import (through a perl script,
> because I needed the information previously in one table to be split
> within several ones, but this does not make a major difference, because
> a perl script only reads one line at a time), the condition is that the
> beginning of each line is easily recognizable. If your first column is
> _not_ easily recognizable, maybe you can find a pattern showing a line
> is incomplete and join with the next line (bad number of |, for
> example).
>
> Hope I've been clear enough (I'm not that good at explaining things in
> English) and that it helps.
>
> --
> Isabelle Brette - isabelle(at)apartia(dot)fr
>

If you use a perl script to 'count' your columns, you can escape your
newlines with a \ in your addresses and postgres will transparently
convert them to newlines in the data. By doing this it will process more
than one line in your text file as a row.

network=# create table tmp2 (id int, name text, address text);
CREATE
network=# insert into tmp2 (id,name,address) values(100,'bill','456
harmony ln.
network'# hickup, NH');
INSERT 86195110 1
network=# select * from tmp2;
id | name | address
-----+------+----------------------------
100 | bill | 456 harmony ln.
hickup, NH
(1 row)

Insert another row.

network=# insert into tmp2 (id,name,address)values(200,'Sam','Is
linebreak here?\newline');
INSERT 86195203 1
network=# select * from tmp2;
id | name | address
-----+------+-----------------------------
100 | bill | 456 harmony ln.
hickup, NH
200 | Sam | Is linebreak here?
ewline
(3 rows)

Here is the export file I copied out from tmp2
100|bill|456 harmony ln.\
hickup, NH
200|Sam|Is linebreak here?\
ewline

This file imported correctly to another table with the newlines embedded
in the data.

> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Bill MacArthur
Webmaster
DHS Club

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message yves.guerin 2002-01-04 19:53:47 libcrypt problem
Previous Message Isabelle Brette 2002-01-03 18:16:39 Re: copy command