Tuning massive UPDATES and GROUP BY's?

From: fork <forkandwait(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Tuning massive UPDATES and GROUP BY's?
Date: 2011-03-10 15:40:53
Message-ID: loom.20110310T163020-437@post.gmane.org
Lists: pgsql-performance

Given that doing a massive UPDATE SET foo = bar || ' ' || baz; on a 12 million
row table (with about 100 columns -- the US Census PUMS for the 2005-2009 ACS)
is never going to be that fast, what should one do to make it faster?
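
For concreteness, it is essentially the following, with no WHERE clause, so all
~12 million rows get rewritten (the table name here is made up):

    UPDATE pums
    SET foo = bar || ' ' || baz;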

I set work_mem to 2048MB, but the UPDATE is currently using only a little
memory and CPU (3% and 15% according to top; an earlier SELECT DISTINCT ...
LIMIT was using 70% of the memory).
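
For reference, the work_mem bump was done along these lines:

    SET work_mem = '2048MB';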

The data is not particularly sensitive; if something happened and it rolled
back, that wouldn't be the end of the world, so I don't know whether I could
get away with "dangerous" settings for WAL, checkpoints, etc. There also
aren't a lot of concurrent hits on the DB, though there are a few.
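
By "dangerous" I mean settings roughly like the following -- I am not sure
which, if any, are actually sensible here, so treat this as a sketch of the
idea rather than a recipe:

    # postgresql.conf -- values are guesses, not recommendations
    synchronous_commit = off
    checkpoint_segments = 64            # default is 3
    checkpoint_completion_target = 0.9
    # fsync = off                       # only if losing the cluster after
                                        # a crash is acceptable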

I am loath to create a new table from a SELECT, since the indexes themselves
take a really long time to build.
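
(The alternative I am avoiding would be something along these lines, again
with made-up names, followed by re-creating every index on the new table:)

    CREATE TABLE pums_new AS
    SELECT bar, baz, bar || ' ' || baz AS foo
    FROM pums;

    CREATE INDEX pums_new_bar_idx ON pums_new (bar);
    -- ... and so on for each existing index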

As the subject alludes to, I will also be doing GROUP BYs on the data, and
would love to speed those up too, mostly just for my own impatience...
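
Those will be something like simple aggregates over the whole table, e.g.
(made-up column names again):

    SELECT st, SUM(pwgtp) AS weighted_persons
    FROM pums
    GROUP BY st;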
