Re: importing db as text files

From: expect <expect(at)ihubbell(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: importing db as text files
Date: 2003-08-14 03:52:34
Message-ID: 20030813205234.0420bb9f.expect@ihubbell.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 14 Aug 2003 07:34:55 +1000
Jason Godden <jasongodden(at)optushome(dot)com(dot)au> wrote:

> Hi expect,
>
> Best way in my opinion is to use the copy table command. This way Pg will
> actually 'massage' the data (string escapes and all) for you.

I guess we're of the same opinion. I did use the copy table command.
I believe the problem is that pg is unable to use the default value when a
value is not present.

>
> If you have complex files best to use a combo of sed/awk to pre-massage the
> field delimiters and import the data to a temporary table. In most instances
> it is best to pipe the data to the psql command using copy table from stdin
> as from file requires that you are the process owner.
>
> Always use a temporary import table and perform validation/further conversion
> in that table (IMO).
>
> eg (from one of my earlier posts)
>
> cat unprocesseddata.txt | sed "s/\",\"/$TAB/g" | sed "s/\"//g" | sed "s/.$//g"
> | psql -dmydatabase -c "copy importtable from stdin delimiter '\t';"
>
> The first sed will replace all instances of "," with a tab character
> (I achieve tab in my shell scripts with a line like TAB=$(printf "\t"))
>
> The second sed will remove the start and end " because the first only checks
> for ",".
>
> The third sed will remove the Windows carriage returns (if any - depends on
> the platform the file was generated from.
>
> Note here that my file format never contains any " in a field so I can safely
> run the second sed. If your import file contains these then you will need to
> change the regex or use awk.
>
> Regards,
>
> Jason
>
> On Thu, 14 Aug 2003 07:14 am, expect wrote:
> > What's the big deal with importing text files? I have a 70 MB file to
> > import and it's been one problem after another. I used the copy command
> > and it appears that it's just not possible. I finally massaged the file
> > into a .sql file and ran that using \i db.sql but that failed too because I
> > overlooked ' in names like D'Adario. The other problem I encountered was
> > that a numeric field had to have data in it, pg would not default to the
> > default value. So instead of massaging all the data again I decided to
> > change the data type for that column. This is my first experience with
> > postgresql and I'm wondering if I should expect to encounter similar pain
> > as I go further into this? So far it's been very painful trying to do what
> > I thought would be easy and what I think should be easy.
> >
> > PostgreSQL 7.3.4 on linux redhat 9
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 7: don't forget to increase your free space map settings
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message expect 2003-08-14 04:14:17 Re: importing db as text files
Previous Message expect 2003-08-14 03:50:06 Re: importing db as text files