From: | Scott Marlowe <smarlowe(at)g2switchworks(dot)com> |
---|---|
To: | Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net> |
Cc: | pgsql general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Newbie question about importing text files... |
Date: | 2006-10-11 20:29:58 |
Message-ID: | 1160598598.6181.50.camel@state.g2switchworks.com |
Lists: | pgsql-general |
On Tue, 2006-10-10 at 04:16, Ron Johnson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 10/09/06 22:43, Jonathan Greenberg wrote:
> > So I've been looking at the documentation for COPY, and I'm curious about a
> > number of features which do not appear to be included, and whether these
> > functions are found someplace else:
> >
> > 1) How do I skip an arbitrary # of "header" lines (e.g. > 1 header line) to
> > begin reading in data?
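Using something like bash, you can do this:
tail -n $(( $(wc -l < bookability-pg.sql) - 2 )) bookability-pg.sql | wc -l
That drops the first 2 lines of the file (the trailing wc -l just counts
the lines that are left, to check it worked). Reading the file on stdin
makes wc print only the count, so a file name with digits in it can't
confuse the arithmetic.
To reuse it, make it a shell function called skipper that takes the file
name and the number of lines to skip as arguments.
Put this in .bashrc and reload it ( . ~/.bashrc ):
skipper(){
tail -n $(( $(wc -l < "$1") - $2 )) "$1"
}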
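tail can also count from the top of the file, which avoids the wc
arithmetic altogether. A minimal sketch, assuming a POSIX tail; the
function name skip_header is made up for this example:

```shell
# skip_header FILE N: print FILE without its first N lines.
# "tail -n +K" starts output at line K, so skipping N lines means K = N + 1.
skip_header() {
    tail -n "+$(( $2 + 1 ))" "$1"
}
```

Something like "skip_header data.txt 3 > data-noheader.txt" would then write
everything after the third line to a new file (both file names are
placeholders).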
> > 2) Is it possible to screen out lines which begin with a comment character
> > (common outputs for csv/txt files from various programs)?
grep -vP "^#" filename
will remove all lines that start with #. grep is your friend in unix.
If you don't have unix, get cygwin as recommended elsewhere.
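Header skipping and comment filtering chain together with a pipe, and the
result can be fed to COPY through psql. A rough sketch, assuming a
comma-separated input file; the function name clean_input, the database
mydb, and the table mytable are placeholders, not anything from this thread:

```shell
# clean_input FILE N: drop the first N header lines, then drop every
# line that starts with "#".
clean_input() {
    tail -n "+$(( $2 + 1 ))" "$1" | grep -v '^#'
}

# Hypothetical load step (mydb and mytable must already exist):
# clean_input data.csv 2 | psql -d mydb -c "COPY mytable FROM STDIN WITH CSV"
```

Only lines with # in the very first column are dropped; a # later in a
line is left alone.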
> > 3) Is there a way to read in fixed width files?
If you don't mind playing about with sed, you could use it and bash
scripting to do it. I have before. It's ugly looking but easy enough
to do. But I'd recommend a beginner use a scripting language they like,
one of the ones that starts with p is usually a good choice (perl,
python, php, ruby (wait, that's not a p!) etc...)
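To give a rough idea of the shell route, awk can slice each line by column
position and print it comma-separated. A sketch, assuming a made-up layout
of three fields at columns 1-10, 11-15, and 16 onward; the function name
fixed_to_csv is hypothetical and the offsets need adjusting to the real file:

```shell
# fixed_to_csv FILE: convert fixed-width lines to comma-separated output.
# The column layout (1-10, 11-15, 16 to end of line) is an assumed example.
fixed_to_csv() {
    awk '
        # Strip leading and trailing spaces from one field.
        function trim(s) { gsub(/^ +| +$/, "", s); return s }
        {
            print trim(substr($0, 1, 10)) "," trim(substr($0, 11, 5)) "," trim(substr($0, 16))
        }
    ' "$1"
}
```

This does no quoting, so a field that itself contains a comma would need
extra handling before the output is safe to hand to COPY.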
>
> Both Python & Perl have CSV parsing modules, and can of course deal
> with fixed-width data, let you skip comments, commit every N rows,
> skip over committed records in case the load crashes, etc, etc, etc.
php has a fgetcsv() built in as well. It breaks down csv into an array
and is really easy to work with.