From: | DHSC Webmaster <webmaster(at)dhs-club(dot)com> |
---|---|
To: | "Zhang, Anna" <azhang(at)verisign(dot)com> |
Cc: | "'pgsql-admin(at)postgresql(dot)org'" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: copy command |
Date: | 2002-01-03 19:11:33 |
Message-ID: | 3C34ACE5.3D201E9@dhs-club.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Isabelle Brette wrote:
>
> Hello,
>
> On Thu, Jan 03, 2002 at 12:10:01PM -0500, Zhang, Anna wrote:
> > I have a problem to load data to postgres database using copy command. The
> > problem is that we have one column called address which is multi-line text,
> > the taxt file looks like this:
> >
> > aab770|awkc.com administration|sultan 23
> > Bogota, na00000|CO
> >
> > above shows one record with '|' as delimiters. Column 3 has two lines which
> > caused the record is splited into two lines. If I use copy command to load
> > this record, it is always treated as two records. Is there any way to tell
> > postgres to read until reach a specified char? not stop at '\n'?
> > Or maybe edit the text file to make it one line, but how? We have milions
> > of such records in the text file.
>
> You can use some powerful editor, such as VI, to recognize any line that
> does not begin with a proper pattern, and join it with the one before.
> Not to lose the line feed you'll probably have to add something to
> replace it (such as, let's say, a tab) and convert it back once you've
> imported (postgres has some nice pattern-matching features).
>
> I've done this before to prepare an import (through a perl script,
> because I needed the information previously in one table to be split
> within several ones, but this does not make a major difference, because
> a perl script only reads one line at a time), the condition is that the
> beginning of each line is easily recognizable. If your first column is
> _not_ easily recognizable, maybe you can find a pattern showing a line
> is incomplete and join with the next line (bad number of |, for
> example).
>
> Hope I've been clear enough (I'm not that good at explaining things in
> English) and that it helps.
>
> --
> Isabelle Brette - isabelle(at)apartia(dot)fr
>
If you use a perl script to 'count' your columns, you can escape your
newlines with a \ in your addresses and postgres will transparently
convert them to newlines in the data. By doing this it will process more
than one line in your text file as a row.
network=# create table tmp2 (id int, name text, address text);
CREATE
network=# insert into tmp2 (id,name,address) values(100,'bill','456
harmony ln.
network'# hickup, NH');
INSERT 86195110 1
network=# select * from tmp2;
id | name | address
-----+------+----------------------------
100 | bill | 456 harmony ln.
hickup, NH
(1 row)
Insert another row.
network=# insert into tmp2 (id,name,address)values(200,'Sam','Is
linebreak here?\newline');
INSERT 86195203 1
network=# select * from tmp2;
id | name | address
-----+------+-----------------------------
100 | bill | 456 harmony ln.
hickup, NH
200 | Sam | Is linebreak here?
ewline
(3 rows)
Here is the export file I copied out from tmp2
100|bill|456 harmony ln.\
hickup, NH
200|Sam|Is linebreak here?\
ewline
This file imported correctly to another table with the newlines embedded
in the data.
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
--
Bill MacArthur
Webmaster
DHS Club
From | Date | Subject | |
---|---|---|---|
Next Message | yves.guerin | 2002-01-04 19:53:47 | libcrypt problem |
Previous Message | Isabelle Brette | 2002-01-03 18:16:39 | Re: copy command |