Re: update faster way

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: update faster way
Date: 2024-09-14 07:39:14
Message-ID: 1955d0e6cafd643520d282a74d9956340983074e.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, 2024-09-14 at 08:43 +0530, yudhi s wrote:
> We have to update a column value(from numbers like '123' to codes like 'abc'
> by looking into a reference table data) in a partitioned table with billions
> of rows in it, with each partition having 100's millions rows. As we tested

> for ~30million rows it's taking ~20minutes to update. So if we go by this
> calculation, it's going to take days for updating all the values. So my
> question is
>
> 1) If there is any inbuilt way of running the update query in parallel
> (e.g. using parallel hints etc) to make it run faster?
> 2) should we run each individual partition in a separate session (e.g. five
> partitions will have the updates done at same time from 5 different
> sessions)? And will it have any locking effect or we can just start the
> sessions and let them run without impacting our live transactions?

Option 1 doesn't exist.
Option 2 is possible, and you can even have more than one session workingr
on a single partition.

However, the strain on your system's resources and particularly the row
locks will impair normal database work.

Essentially, you can either take an extended down time or perform the updates
in very small chunks with a very low "lock_timeout" over a very long period
of time. If any of the batches fails because of locking conflicts, it has
to be retried.

Investigate with EXPLAIN (ANALYZE) why the updates take that long. It could
be a lame disk, tons of (unnecessary?) indexes or triggers, but it might as
well be the join with the lookup table, so perhaps there is room for
improvement (more "work_mem" for a hash join?).

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dan Kortschak 2024-09-14 08:33:33 Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
Previous Message yudhi s 2024-09-14 03:58:52 Re: update faster way