Quick Links

Re: The fastest way to update thousands of rows in moderately sized table

From:	Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To:	twoflower <standa(dot)kurik(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: The fastest way to update thousands of rows in moderately sized table
Date:	2015-07-23 20:55:55
Message-ID:	55B154DB.30305@aklaver.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 07/23/2015 01:17 PM, twoflower wrote:
> Hello, I have a table with 30 million records in which I need to update
> a single column for a couple of thousands of rows, let's say 10 000. The
> new column value is identical for all matching rows. Doing
>
> |update "TRANSLATION" set fk_assignment where fk_job = 1000; |
>
> takes 45 seconds. I understand that |UPDATE| is basically an |INSERT|
> followed by |DELETE| but I was hoping I could do better than that. I
> found a suggestion to use a temporary table to speed things up, so now I
> have this:
>
> |create unlogged table "temp_table" as
> select id, fk_assignment
> from "TRANSLATION"
> where fk_job = 1000;
>
> update "temp_table" set fk_assignment = null;
>
> update "TRANSLATION" _target
> set fk_assignment = _source.fk_assignment
> from "temp_table" _source
> where _target.id = _source.id;
>
> drop table "temp_table";

Have you tried wrapping the above in a BEGIN/COMMIT block?

So:

BEGIN;

create unlogged table "temp_table" as
select id, fk_assignment
from "TRANSLATION"
where fk_job = 1000;

update "temp_table" set fk_assignment = null;

update "TRANSLATION" _target
set fk_assignment = _source.fk_assignment
from "temp_table" _source
where _target.id = _source.id;

drop table "temp_table";

COMMIT;

> |
>
> This got me to about 37 seconds. Still pretty slow. The |TRANSLATION|
> has an index and a foreign key constraint on |fk_assignment|. Removing
> the constraint brought very little benefit. Removing the index is
> probably out of question as these kind of operations are very frequent
> and the table itself is used heavily, including the index. Execution plan:
>
> |Update on "TRANSLATION" _target (cost=0.56..116987.76 rows=13983
> width=405) (actual time=43262.266..43262.266 rows=0 loops=1)
> -> Nested Loop (cost=0.56..116987.76 rows=13983 width=405) (actual
> time=0.566..146.084 rows=8920 loops=1)
> -> Seq Scan on temp_segs _source (cost=0.00..218.83 rows=13983
> width=22) (actual time=0.457..13.994 rows=8920 loops=1)
> -> Index Scan using "TRANSLATION_pkey" on "TRANSLATION" _target
> (cost=0.56..8.34 rows=1 width=391) (actual time=0.009..0.011 rows=1
> loops=8920)
> Index Cond: (id = _source.id)
>
> Planning time: 1.167 ms
> Execution time: 43262.577 ms
> |
> Is there anything else worth trying? Are these numbers something to be
> expected, from your experience?
>
> I have Postgres 9.4, the database is on SSD.
>
> Thank you very much for any suggestions.
>
> Standa
> ------------------------------------------------------------------------
> View this message in context: The fastest way to update thousands of
> rows in moderately sized table
> <http://postgresql.nabble.com/The-fastest-way-to-update-thousands-of-rows-in-moderately-sized-table-tp5859144.html>
> Sent from the PostgreSQL - general mailing list archive
> <http://postgresql.nabble.com/PostgreSQL-general-f1843780.html> at
> Nabble.com.

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

The fastest way to update thousands of rows in moderately sized table at 2015-07-23 20:17:46 from twoflower

Responses

Re: The fastest way to update thousands of rows in moderately sized table at 2015-07-23 21:00:50 from twoflower

Browse pgsql-general by date

	From	Date	Subject
Next Message	Tim Clarke	2015-07-23 20:58:21	Re: Using the database to validate data
Previous Message	Adrian Klaver	2015-07-23 20:48:36	Re: Using the database to validate data