Re: Thoughts on how to avoid a massive integer update.

From: "Fehrle, Brian" <bfehrle(at)comscore(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Thoughts on how to avoid a massive integer update.
Date: 2020-05-08 19:35:13
Message-ID: E6F1ABF5-C4A7-4672-A304-02A1C565711F@comscore.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Date: Friday, May 8, 2020 at 11:48 AM
To: "Fehrle, Brian" <bfehrle(at)comscore(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Thoughts on how to avoid a massive integer update.

[External Email]
On Mon, May 4, 2020 at 2:33 PM Fehrle, Brian <bfehrle(at)comscore(dot)com<mailto:bfehrle(at)comscore(dot)com>> wrote:
I NEED to update every single row in all these tables, changing the integer value to a different integer.

Does anyone have any hackery ideas on how to achieve this in less time?

Probably the only solution that would perform computationally faster would take the same amount of time or more to code and debug, and be considerably riskier. Basically shut down PostgreSQL and modify the data files directly to change one integer byte sequence to another. On the positive side the source code for PostgreSQL is open source and that data, while complex, is structured.

On the point of "vacuum" versus "vacuum full" - I don't know if this is how it would work in reality but conceptually if you updated half the table, vacuumed, updated the second half, vacuumed, the second batch of updates would reuse the spaced freed from the first batch and you'd only increase the disk consumption by 1.5 instead of 2.0. As you increase the number of batches the percentage of additional space consumed decreases. Though if you have the space I'd have to imagine that creating a brand new table and dropping the old one would be the best solution when taken in isolation.

David J.

Modifying data files is too risky and I wouldn’t be able to get that kind of work approved.

Even with keeping excess space to an additional 50%, that’s tons of storage I’d need to order, so either vacuum full or re-create tables for minimal on disk usage are my only options.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Fehrle, Brian 2020-05-08 19:36:18 Re: Thoughts on how to avoid a massive integer update.
Previous Message Adrian Klaver 2020-05-08 19:34:10 Re: Odd postgres12 upgrade is changing or munging a password?