Re: [HACKERS] Bulk update of large database

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Weinberg <weinberg(at)osprey(dot)astro(dot)umass(dot)edu>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Bulk update of large database
Date: 1999-11-20 17:32:16
Message-ID: 7582.943119136@sss.pgh.pa.us
Lists: pgsql-hackers

Martin Weinberg <weinberg(at)osprey(dot)astro(dot)umass(dot)edu> writes:
> This does work but appends the updates (until the next vacuum).
> For a 100GB database, this is too large of a storage overhead.
> Is there another good way?

There is no alternative; any sort of update operation will write a
new tuple value without first deleting the old. This must be so
to preserve transaction semantics: if an error occurs later on
during the update (e.g., violation of a unique-index constraint), the
old tuple value must still be there.
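
For instance, a minimal sketch of that kind of failure (the table and
column names here are only illustrative):

    CREATE TABLE t (id integer PRIMARY KEY, val text);
    INSERT INTO t VALUES (1, 'one');
    INSERT INTO t VALUES (2, 'two');

    BEGIN;
    UPDATE t SET id = 2 WHERE id = 1;
    -- fails with a duplicate-key error on the unique index
    ROLLBACK;

    SELECT * FROM t;
    -- both original rows are still intact, because the UPDATE only
    -- wrote a new tuple version; the old one was never overwritten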

The only answer I can see is to update however many tuples you can
spare the space for, commit the transaction, vacuum, repeat.
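
A rough sketch of that loop, assuming the table has an integer key you
can batch on (table and column names here are invented):

    BEGIN;
    UPDATE bigtable SET val = val + 1
        WHERE id >= 0 AND id < 1000000;
    COMMIT;
    VACUUM bigtable;

    BEGIN;
    UPDATE bigtable SET val = val + 1
        WHERE id >= 1000000 AND id < 2000000;
    COMMIT;
    VACUUM bigtable;

    -- ... and so on, sizing each batch so the dead tuples it leaves
    -- behind fit in the disk space you can spare until the VACUUM runs.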

The need for repeated vacuums in this scenario is pretty annoying.
It'd be nice if we could recycle dead tuples without a full vacuum.
Offhand I don't see any way to do it without introducing performance
penalties elsewhere...

regards, tom lane
