From: Emi Lu <emilu(at)encs(dot)concordia(dot)ca>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>, Alex Goncharov <alex(dot)goncharov(dot)usa(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: autocommit (true/false) for more than 1 million records
Date: 2014-08-28 14:49:08
Message-ID: 53FF4164.9070102@encs.concordia.ca
Lists: pgsql-performance
Hello All,
I learned a lot from all of your input. To share one more thing that I tried:
bypassing autocommit from Java/JDBC by using the driver's CopyManager:
(1) Read/save the source data into f1.csv, f2.csv, ......
(2) COPY/load each file into the destination PostgreSQL database:
// Requires the PostgreSQL JDBC driver classes
// org.postgresql.copy.CopyManager and org.postgresql.core.BaseConnection.
CopyManager cm = new CopyManager((BaseConnection) conn_psql);
try (FileReader fileReader = new FileReader("f1.csv")) {
    // COPY streams the whole file as one server-side bulk load,
    // so per-row autocommit never comes into play.
    cm.copyIn("COPY table_name FROM STDIN WITH DELIMITER '|'", fileReader);
}
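Step (1) above can be sketched as follows. This is a minimal, hypothetical helper (the class name, chunk size, and sample rows are illustrative, not from the original post) that splits rows into pipe-delimited files f1.csv, f2.csv, ... matching the DELIMITER '|' used by the COPY command:

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ChunkWriter {
    // Write rows into pipe-delimited files f1.csv, f2.csv, ... with at most
    // chunkSize rows each; returns the number of files created.
    static int writeChunks(List<String[]> rows, Path dir, int chunkSize) throws IOException {
        int fileCount = 0;
        for (int i = 0; i < rows.size(); i += chunkSize) {
            fileCount++;
            Path out = dir.resolve("f" + fileCount + ".csv");
            try (PrintWriter w = new PrintWriter(Files.newBufferedWriter(out))) {
                for (String[] row : rows.subList(i, Math.min(i + chunkSize, rows.size()))) {
                    w.println(String.join("|", row));
                }
            }
        }
        return fileCount;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("chunks");
        List<String[]> rows = List.of(
                new String[]{"1", "alpha"},
                new String[]{"2", "beta"},
                new String[]{"3", "gamma"});
        int n = writeChunks(rows, dir, 2);   // -> 2 files: f1.csv (2 rows), f2.csv (1 row)
        System.out.println(n + " files written to " + dir);
    }
}
```

Each resulting file can then be handed to cm.copyIn() exactly as in the snippet above.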
Emi
On 08/27/2014 08:59 AM, Kevin Grittner wrote:
> Alex Goncharov <alex(dot)goncharov(dot)usa(at)gmail(dot)com> wrote:
>> Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>> The rows will all be in the table, but not visible to any other
>>> transaction.
>> How much data can I fit there while doing COPY? Not 1 TB?
> As has already been said, why not? This is not some special
> section of the table -- the data is written to the table. Period.
> Commit or rollback just tells new transactions whether data flagged
> with that transaction number is visible.
>
> Nobody can tell you how much space that will take -- it depends on
> many factors, including how many columns of what kind of data, how
> compressible it is, and how it is indexed. But the point is, we
> are not talking about any separate space from what is needed to
> store the data in the database.
>
> FWIW, I think the largest single COPY statement I ever ran was
> generated by pg_dump and piped directly to psql for a major release
> upgrade (before pg_upgrade was available), and it was somewhere in
> the 2TB to 3TB range. It took a long time, but it "just worked".
> That should be true for 10TB or 100TB, as long as you have sized
> the machine correctly and are loading clean data. Whether you have
> that covered, and how you want to "hedge your bets" based on your
> degree of confidence in those things is a judgment call. When I'm
> in the position of needing to make such a call, I like to do some
> tests.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company