From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Yang Zhang <yanghatespam(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Optimizing bulk update performance |
Date: | 2013-04-27 15:20:04 |
Message-ID: | 9699.1367076004@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Yang Zhang <yanghatespam(at)gmail(dot)com> writes:
> You're right, we're only sequentially issuing (unprepared) UPDATEs.
You definitely want to fix both parts of that, then.
> If we ship many UPDATE statements per call to our DB API's execution
> function (we're using Python's psycopg2 if that matters, but I think
> that just binds libpq), would that avoid the network round trip per
> statement?
Possibly, not sure how psycopg2 handles that.
> If not, what if we use anonymous procedures (DO) to run multiple
> UPDATE statements?
I don't think an anonymous procedure as such would result in any
plan caching, at least not unless you could write it to have a single
UPDATE in a loop.
> Finally, we could use the technique highlighted in my third bullet and
> use COPY (or at least multiple-value INSERT), then merging the new
> data with the old. Would that be the most direct route to maximum
> performance?
It might help, you'd need to try it.
> In any case, I assume deleting and rebuilding indexes is important
> here, yes? But what about raising checkpoint_segments - does this
> actually help sustained throughput?
If you're updating as much as 50% of the table, and you don't need the
indexes for other purposes meanwhile, dropping and rebuilding them would
be worth trying.
Also, you definitely want checkpoint_segments large enough so that
checkpoints are at least a few minutes apart. Excess checkpoints do
represent a sustained drag on performance because they mean a greater
volume of disk writes.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Yang Zhang | 2013-04-27 17:40:20 | Re: Basic question on recovery and disk snapshotting |
Previous Message | Tobias Oberstein | 2013-04-27 13:19:00 | outgoing TCP from custom background worker? |