Re: autocommit (true/false) for more than 1 million records

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Alex Goncharov <alex(dot)goncharov(dot)usa(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: autocommit (true/false) for more than 1 million records
Date: 2014-08-27 12:59:29
Message-ID: 1409144369.45962.YahooMailNeo@web122306.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alex Goncharov <alex(dot)goncharov(dot)usa(at)gmail(dot)com> wrote:
> Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

>> The rows will all be in the table, but not visible to any other
>> transaction.
>
> How much data can I fit there while doing COPY?  Not 1 TB?

As has already been said, why not?  This is not some special
section of the table -- the data is written to the table.  Period.
Commit or rollback just tells new transactions whether data flagged
with that transaction number is visible.

Nobody can tell you how much space that will take -- it depends on
many factors, including how many columns of what kind of data, how
compressible it is, and how it is indexed.  But the point is, we
are not talking about any separate space from what is needed to
store the data in the database.

FWIW, I think the largest single COPY statement I ever ran was
generated by pg_dump and piped directly to psql for a major release
upgrade (before pg_upgrade was available), and it was somewhere in
the 2TB to 3TB range.  It took a long time, but it "just worked".
That should be true for 10TB or 100TB, as long as you have sized
the machine correctly and are loading clean data.  Whether you have
that covered, and how you want to "hedge your bets" based on your
degree of confidence in those things is a judgment call.  When I'm
in the position of needing to make such a call, I like to do some
tests.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message gmb 2014-08-28 08:50:40 Performance issue: index not used on GROUP BY...
Previous Message Albe Laurenz 2014-08-27 12:41:50 Re: autocommit (true/false) for more than 1 million records