copy losing information

From: "Silvela, Jaime (Exchange)" <JSilvela(at)Bear(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: copy losing information
Date: 2006-07-26 16:48:37
Message-ID: 6D6734D7CD866145AE87A2D5D88830A90228290B@whexchmb14.bsna.bsroot.bear.com
Lists: pgsql-general

This is my first post to the list. I did a brief search and didn't find
this issue already covered, so here goes. Apologies if it has been
reported before.

I have a pretty big file, around 2 million rows, in tab-separated
format, with 4 columns, that I read into a table in Postgres using the
copy command.

I've started to notice missing data. I'll truncate the table, load the
file, and find that sometimes there are fewer rows in the table than in
the file.

This is not reliably reproducible. If I truncate again and reload, I may
get all the lines, or I may get a different number of missing lines.
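
For comparison against SELECT count(*) on the table, the file side of the
count can be taken independently of the loader. The following is only a
minimal sketch: the helper name summarize_tsv and its return shape are
hypothetical, and it assumes one row per line, LF or CRLF line endings,
and exactly 4 tab-separated columns.

```ruby
# Hypothetical helper (not part of any library): count the lines in a
# tab-separated file and record the line numbers whose field count differs
# from the expected number of columns.
def summarize_tsv(path, expected_columns = 4)
  total = 0
  bad_lines = []
  # Binary mode so that Windows CRLF endings are seen exactly as stored.
  File.open(path, "rb") do |file|
    file.each_line do |line|
      total += 1
      # chomp strips a trailing LF or CRLF; the -1 limit keeps trailing
      # empty fields so an empty last column still counts as a field.
      fields = line.chomp.split("\t", -1)
      bad_lines << total unless fields.length == expected_columns
    end
  end
  { rows: total, bad_lines: bad_lines }
end
```

If :rows matches the table count after a load, nothing was dropped; any
entries in :bad_lines are worth inspecting by hand first.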

I concluded that there was a bug in the copy command, and wrote a
replacement in Ruby, using the pure-ruby Postgres-pr library.

I ran into the same issue. Some lines seem to be dropped, but the
program reports no exceptions or SQL errors.

To improve throughput, my Ruby program connects to the server just once
and sends the INSERT statements in batches of 2000.
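
The batching can be sketched roughly as follows. This is not the actual
program: send_in_batches is a hypothetical name, and the database call is
left to the caller's block so that the sketch does not assume any
particular driver API. It keeps a tally of rows handed off, which can be
compared with the number of lines read from the file.

```ruby
# Hypothetical helper: hand rows to the caller in fixed-size batches and
# keep a running tally of how many rows and batches were handed off.
def send_in_batches(rows, batch_size = 2000)
  sent = 0
  batches = 0
  rows.each_slice(batch_size) do |batch|
    yield batch            # the caller issues the INSERTs for this batch
    sent += batch.length   # counted only after the block returns normally
    batches += 1
  end
  { sent: sent, batches: batches }
end

# Usage with stand-in data: 4500 rows give batches of 2000, 2000 and 500.
sizes = []
stats = send_in_batches((1..4500).to_a) { |batch| sizes << batch.length }
```

Because the tally is updated only after the block returns, an exception
raised while sending a batch leaves that batch out of :sent.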

I've checked that the file doesn't contain any SQL escape sequences or
anything else that would invalidate an INSERT.
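
A check of this kind might look like the sketch below. The helper name
suspicious_lines is hypothetical, and the choice of characters is an
assumption about what matters: a backslash is the escape character in
COPY's text format, and a single quote ends an SQL string literal.

```ruby
# Hypothetical helper: return the 1-based numbers of the lines that contain
# a backslash or a single quote.
def suspicious_lines(path)
  hits = []
  File.open(path, "rb") do |file|
    file.each_line.with_index(1) do |line, number|
      hits << number if line.include?("\\") || line.include?("'")
    end
  end
  hits
end
```

An empty result means neither character occurs anywhere in the file.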

The server is running 8.1.3 on Linux 2.6.5 on an Intel platform.

The imports are run from Windows machines on the same network.

Has anybody seen this before?

Thanks

Jaime
