From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Israel Brewster <ijbrewster(at)alaska(dot)edu>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: UPDATE many records |
Date: | 2020-01-06 19:07:14 |
Message-ID: | 0dc9dec2-8227-f489-b38a-f034b8746ae6@aklaver.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 1/6/20 10:36 AM, Israel Brewster wrote:
> Thanks to a change in historical data, I have a need to update a large
> number of records (around 50 million). The update itself is straight
> forward, as I can just issue an "UPDATE table_name SET
> changed_field=new_value();" (yes, new_value is the result of a stored
> procedure, if that makes a difference) command via psql, and it should
> work. However, due to the large number of records this command will
> obviously take a while, and if anything goes wrong during the update
> (one bad value in row 45 million, lost connection, etc), all the work
> that has been done already will be lost due to the transactional nature
> of such commands (unless I am missing something).
>
> Given that each row update is completely independent of any other row, I
> have the following questions:
>
> 1) Is there any way to set the command such that each row change is
> committed as it is calculated?
Pretty sure:
UPDATE table_name SET changed_field=new_value();
is seen as a single statement and is all or none.
If you want to go row by row you will need to have the statement run on
a row by row basis or maybe in batches.
> 2) Is there some way to run this command in parallel in order to better
> utilize multiple processor cores, other than manually breaking the data
> into chunks and running a separate psql/update process for each chunk?
> Honestly, manual parallelizing wouldn’t be too bad (there are a number
> of logical segregations I can apply), I’m just wondering if there is a
> more automatic option.
This is good time to ask what Postgres version?
I am still working out the recent parallel query system additions. Not
sure if it applies to UPDATE or not.
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory
> Geophysical Institute - UAF
> 2156 Koyukuk Drive
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell: 907-328-9145
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
From | Date | Subject | |
---|---|---|---|
Next Message | Christopher Browne | 2020-01-06 19:08:09 | Re: UPDATE many records |
Previous Message | Justin | 2020-01-06 19:05:04 | Re: UPDATE many records |