Re: UPDATE on 20 Million Records Transaction or not?

From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: UPDATE on 20 Million Records Transaction or not?
Date: 2020-06-23 15:36:31
Message-ID: 023c409e-5313-735d-6c26-4c3a300310fd@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


I'd make a copy of the table, and test how long the various methods take.

On 6/23/20 10:17 AM, Jason Ralph wrote:
>
> Hello List,
>
> PostgreSQL 11.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (R
>
> ed Hat 4.4.7-23), 64-bit
>
> I am planning an update on a table with 20Million records, I have been
> researching the best practices.  I will remove all indexes and foreign
> keys prior to the update, however I am not sure if I should use a
> transaction or not.
>
> My thought process is that a transaction would be easier to recover if
> something fails, however it would take more time to write to the WAL log
> in a transaction.
>
>
> >Are you updating *every* row in the table?
>
> No I am using an update like so: UPDATE members SET regdate='2038-01-18'
> WHERE regdate='2020-07-07'
>
> DB=# select count(*) from members where regdate = '2020-07-07';
>
>   count
>
> ----------
>
> 17333090
>
> (1 row)
>
> >Are you updating indexed fields?  (If not, then leave the indexes and
> FKs, since they won't be touched.)
>
> Just checked regdate is not indexed so I will leave them in place.
>
>
> Would it make sense to make a back up of the table then execute update
> without a transaction?
>
>
> >Always make a backup.
>
> Agreed
>
>
> How would you guys do it?
>
>
> >It depends on what percentage of the rows are being updated, which
> columns are being updated and how big the records are.
>
> Please see above, thanks
>
> *Jason Ralph*
>
> *From:* Ron <ronljohnsonjr(at)gmail(dot)com>
> *Sent:* Tuesday, June 23, 2020 10:57 AM
> *To:* pgsql-general(at)lists(dot)postgresql(dot)org
> *Subject:* Re: UPDATE on 20 Million Records Transaction or not?
>
> On 6/23/20 8:32 AM, Jason Ralph wrote:
>
> Hello List,
>
> PostgreSQL 11.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (R
>
> ed Hat 4.4.7-23), 64-bit
>
> I am planning an update on a table with 20Million records, I have been
> researching the best practices.  I will remove all indexes and foreign
> keys prior to the update, however I am not sure if I should use a
> transaction or not.
>
> My thought process is that a transaction would be easier to recover if
> something fails, however it would take more time to write to the WAL
> log in a transaction.
>
>
> Are you updating *every* row in the table?
>
> Are you updating indexed fields?  (If not, then leave the indexes and FKs,
> since they won't be touched.)
>
>
> Would it make sense to make a back up of the table then execute update
> without a transaction?
>
>
> Always make a backup.
>
>
> How would you guys do it?
>
>
> It depends on what percentage of the rows are being updated, which columns
> are being updated and how big the records are.
>
> --
> Angular momentum makes the world go 'round.
>
> This message contains confidential information and is intended only for
> the individual named. If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> immediately by e-mail if you have received this e-mail by mistake and
> delete this e-mail from your system. E-mail transmission cannot be
> guaranteed to be secure or error-free as information could be intercepted,
> corrupted, lost, destroyed, arrive late or incomplete, or contain viruses.
> The sender therefore does not accept liability for any errors or omissions
> in the contents of this message, which arise as a result of e-mail
> transmission. If verification is required please request a hard-copy version.

--
Angular momentum makes the world go 'round.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Gauthier 2020-06-23 16:04:50 getting daily stats for event counts
Previous Message Klaudie Willis 2020-06-23 15:34:45 Re: n_distinct off by a factor of 1000