Re: Need suggestion on how best to update 3 million rows

From: Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com>
To: Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need suggestion on how best to update 3 million rows
Date: 2007-09-06 09:23:44
Message-ID: 1189070624.17979.12.camel@neuromancer.home.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 2007-09-06 at 11:08 +0200, Alban Hertroys wrote:
> Ow Mun Heng wrote:
> > I found 2 new ways to do this.
> >
> > option 1
> > -------
> >
> > create table foo as select unique_id, rtrim(number) as number from foo;
> > alter table add primary key...
> > create index...
> > drop org_table
> > alter table rename...
> > All this is ~10min
>
> This only works if you don't have foreign key constraints involving that
> table. Otherwise you just lost your data integrity (although I expect an
> error to be thrown).

Got it.. Don't use FK's so.. I'm safe (for now)
>
> > option 2
> > ========
> > This I saw in the mysql archives (in my laptop).. when I say this I
> > went.. WTF? This is possible?? Dang IT!
> >
> > update a set number=replace(number,'ABC ', 'ABC') where reptest like '%
> > ABC%';
>
> Ehrm... yes, nothing special about it. Basic SQL really ;)
Hmm.. I feel the salt..

> But shouldn't you be using trim() or rtrim() instead?:
>
> update table set number = trim(number)

Hmm.. didn't think of that. Next time I guess. (in all honestly, I
didn't know you can update it on the same process/column/table. I was
dumping it to a separate table and updating it..

Now I know..
>
> you could probably speed that up by only querying the records that need
> trimming, for example:
>
> create index tmp_idx on table(number) where number != trim(number);
> analyze table;
> update table set number = trim(number) where number != trim(number);

all fields in that column is affected. I have " " (5 spaces) instead
of nulls

Thanks for the pointers..

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2007-09-06 09:47:39 Re: Need suggestion on how best to update 3 million rows
Previous Message Ow Mun Heng 2007-09-06 09:20:57 Re: Need suggestion on how best to update 3 million rows