From: | Emi Lu <emilu(at)encs(dot)concordia(dot)ca> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Which update action quicker? |
Date: | 2014-09-24 14:13:05 |
Message-ID: | 5422D171.5060801@encs.concordia.ca |
Lists: | pgsql-performance |
Hello,
> For a big table with more than 10 Million records, may I know which update is
> quicker please?
> (1) update t1
> set c1 = a.c1
> from a
> where pk and
> t1.c1 <> a.c1;
> ......
> update t1
> set c_N = a.c_N
> from a
> where pk and
> t1.c_N <> a.c_N;
>
>
> (2) update t1
> set c1 = a.c1 ,
> c2 = a.c2,
> ...
> c_N = a.c_N
> from a
> where pk AND
> (t1.c1, c2...c_N) <> (a.c1, c2... c_N)
Probably (2). <> is not indexable, so each update will have to perform a
sequential scan of the table. With (2), you only need to scan it once,
with (1) you have to scan it N times. Also, method (1) will update the
same row multiple times, if it needs to have more than one column updated.
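As a rough illustration of option (2), the single-pass form might be written as below. This is only a sketch: the key column name `pk` and the three columns `c1`..`c3` (standing in for c1..c_N) are assumptions, since the original post elides the join condition and the column list. It also swaps `<>` for `IS DISTINCT FROM`, because a row comparison with `<>` can yield NULL when columns contain NULLs, and such rows would then be skipped.

```sql
-- Sketch only: pk, c1, c2, c3 are hypothetical names standing in for the
-- elided join key and column list of the original post.
UPDATE t1
SET    c1 = a.c1,
       c2 = a.c2,
       c3 = a.c3
FROM   a
WHERE  t1.pk = a.pk
  -- NULL-safe row comparison: true only when at least one column differs,
  -- treating NULL as an ordinary comparable value.
  AND  (t1.c1, t1.c2, t1.c3) IS DISTINCT FROM (a.c1, a.c2, a.c3);
```

Rows that already match are filtered out by the WHERE clause, so they are read but not rewritten.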
> Or other quicker way for update action?
If a large percentage of the table needs to be updated, it can be faster
to create a new table, insert all the rows with the right values, drop
the old table and rename the new one in its place. All in one transaction.
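The rebuild-and-swap approach described above could look roughly like this. It is a sketch under several assumptions: the key column `pk`, the columns `c1`..`c3`, and the name `t1_new` are all hypothetical, and t1 is assumed to have no other columns. Foreign keys, views, and grants that refer to the old table are not carried over by this sketch and would need separate handling.

```sql
BEGIN;

-- Empty copy of t1 with the same column definitions, defaults and indexes.
CREATE TABLE t1_new (LIKE t1 INCLUDING ALL);

-- Write every row once, taking values from a where a matching row exists
-- and keeping t1's own values otherwise.
INSERT INTO t1_new (pk, c1, c2, c3)
SELECT t1.pk,
       CASE WHEN a.pk IS NOT NULL THEN a.c1 ELSE t1.c1 END,
       CASE WHEN a.pk IS NOT NULL THEN a.c2 ELSE t1.c2 END,
       CASE WHEN a.pk IS NOT NULL THEN a.c3 ELSE t1.c3 END
FROM   t1
LEFT JOIN a ON a.pk = t1.pk;

-- Swap the new table into place; both steps are transactional in PostgreSQL.
DROP TABLE t1;
ALTER TABLE t1_new RENAME TO t1;

COMMIT;
```

Note that this always writes the whole table, so it only pays off when a large share of the rows actually change.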
My situation is this:
(t1.c1, c2, ... c_N) <> (a.c1, c2, ... c_N) will not return many differing rows, so in most cases the statement will mostly read data and write very little.
By contrast, truncate/delete followed by copy will definitely rewrite all of the more than 10 million rows.
In a situation like this, would delete/insert still be quicker?
Thank you
Emi