Re: importing db as text files

From: Jason Godden <jasongodden(at)optushome(dot)com(dot)au>
To: expect <expect(at)ihubbell(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: importing db as text files
Date: 2003-08-13 21:34:55
Message-ID: 200308140734.56108.jasongodden@optushome.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi expect,

Best way in my opinion is to use the copy table command. This way Pg will
actually 'massage' the data (string escapes and all) for you.

If you have complex files best to use a combo of sed/awk to pre-massage the
field delimiters and import the data to a temporary table. In most instances
it is best to pipe the data to the psql command using copy table from stdin
as from file requires that you are the process owner.

Always use a temporary import table and perform validation/further conversion
in that table (IMO).

eg (from one of my earlier posts)

cat unprocesseddata.txt | sed "s/\",\"/$TAB/g" | sed "s/\"//g" | sed "s/.$//g"
| psql -dmydatabase -c "copy importtable from stdin delimiter '\t';"

The first sed will replace all instances of "," with a tab character
(I achieve tab in my shell scripts with a line like TAB=$(printf "\t"))

The second sed will remove the start and end " because the first only checks
for ",".

The third sed will remove the Windows carriage returns (if any - depends on
the platform the file was generated from.

Note here that my file format never contains any " in a field so I can safely
run the second sed. If your import file contains these then you will need to
change the regex or use awk.

Regards,

Jason

On Thu, 14 Aug 2003 07:14 am, expect wrote:
> What's the big deal with importing text files? I have a 70 MB file to
> import and it's been one problem after another. I used the copy command
> and it appears that it's just not possible. I finally massaged the file
> into a .sql file and ran that using \i db.sql but that failed too because I
> overlooked ' in names like D'Adario. The other problem I encountered was
> that a numeric field had to have data in it, pg would not default to the
> default value. So instead of massaging all the data again I decided to
> change the data type for that column. This is my first experience with
> postgresql and I'm wondering if I should expect to encounter similar pain
> as I go further into this? So far it's been very painful trying to do what
> I thought would be easy and what I think should be easy.
>
> PostgreSQL 7.3.4 on linux redhat 9
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dennis Gearon 2003-08-13 21:43:08 Re: Sorting Problem
Previous Message Bruno Wolff III 2003-08-13 21:30:04 Re: importing db as text files