From: | Reece Hart <hart(dot)reece(at)gene(dot)com> |
---|---|
To: | "Silvela, Jaime (Exchange)" <JSilvela(at)Bear(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: copy losing information |
Date: | 2006-07-28 21:33:36 |
Message-ID: | 1154122416.10397.165.camel@tallac.gene.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Wed, 2006-07-26 at 13:38 -0700, Reece Hart wrote:
> If you can release the data and get it to me (e.g., compressed email
> attachment, http, ftp), I volunteer to try a run on 8.1.4 on suse
> 10.0/x86_64.
Jaime-
I cannot replicate your observations.
There were 1915733 lines in the file you sent. In all tests below,
select(*) returned this number.
- linux/x86 client: within psql shell, create table, \copy, select
count(*), truncate, \copy, select count(*). OK.
- linux/x86 client: psql -dcsb-dev -c '\copy
haver_msamo_raw(series,ticker,date,value) from test.out', then
select(*). OK
- win XP client: command line copy as you had posted, akin to above.
Twice. OK
- linux/x86 client: truncate-copy-select(*) 35 times. all OK
All tests were with 8.1.4 server and clients. server_encoding and
client_encoding are SQL_ASCII according to pg_settings. On windows, I
got some error about code page mismatch at the console, but I doubt
that's relevant.
Am at a loss to explain what you see. I don't know enough about
character encoding to know whether that might be a culprit. Since you
apparently have a linux box (from your version() string), have you tried
loading from the server box?
Good luck,
Reece
Some details:
tallac$ gzip -t test.out.gz
test.out.gz: OK
tallac$ gzip -cd test.out.gz | wc -l
test.out.gz: 88.0%
1915733
tallac$ md5sum test.out.gz
929582602507880045c1795970c974e4 test.out.gz
And in PostgreSQL:
rkh(at)csb-dev=> select version();
version
------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.1.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.0.2 20050901 (prerelease) (SUSE Linux)
(1 row)
rkh(at)csb-dev=> TRUNCATE haver_msamo_raw ;
TRUNCATE TABLE
rkh(at)csb-dev=> \copy haver_msamo_raw(series,ticker,date,value) from copy/test.out
\.
rkh(at)csb-dev=> select count(*) from haver_msamo_raw ;
count
---------
1915733
(1 row)
Linux command line:
$ psql -dcsb-dev -c '\copy haver_msamo_raw(series,ticker,date,value) from test.out'
Windows XP command line (psql from postgresql org 8.1.4 binaries):
psql.exe -hcsb -dcsb-dev -Upostgres -c "\copy haver_msamo_raw (series,ticker,date,value) from C:\temp\test.out"
Then I did this 35 times (in a shell script):
truncate rkh.haver_msamo_raw;
select 'before',count(*) from rkh.haver_msamo_raw;
\copy rkh.haver_msamo_raw(series,ticker,date,value) from test.out
select 'after',count(*) from rkh.haver_msamo_raw;
--
Reece Hart, Ph.D. rkh(at)gene(dot)com, http://www.gene.com/
Genentech, Inc. 650-225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93 http://harts.net/reece/
South San Francisco, CA 94080-4990 reece(at)harts(dot)net, GPG:0x25EC91A0
From | Date | Subject | |
---|---|---|---|
Next Message | gene Campbell | 2006-07-28 22:47:26 | Secure connections with changing ip addresses? |
Previous Message | Tom Lane | 2006-07-28 21:17:54 | Re: What's special about 1916-10-01 02:25:20? Odd jump in internal timestamptz representation |