From: andrew(at)pillette(dot)com
To: okparanoid(at)free(dot)fr
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: update 600000 rows
Date: 2007-12-16 05:21:47
Message-ID: 200712160521.lBG5Llf25219@pillette.com
Lists: pgsql-performance
Loc Marteau <okparanoid(at)free(dot)fr> wrote ..
> Steve Crawford wrote:
> > If this
> > is correct, I'd first investigate simply loading the csv data into a
> > temporary table, creating appropriate indexes, and running a single
> > query to update your other table.
My experience is that this is MUCH faster. My predecessor in my current position was doing an update from a CSV file line by line with Perl. That is one reason he is my predecessor. Performance did not justify continuing his contract.
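For illustration only, a rough sketch of that approach; the staging table, production table, column names (staging, prod, id, val) and the file path are hypothetical, not taken from the original posts:

    -- Load the CSV into a temporary staging table.
    CREATE TEMP TABLE staging (id integer, val text);
    COPY staging FROM '/tmp/data.csv' WITH CSV;

    -- Index and analyze so the join in the UPDATE can use it.
    CREATE INDEX staging_id_idx ON staging (id);
    ANALYZE staging;

    -- One set-based UPDATE instead of 600000 single-row statements.
    UPDATE prod
    SET    val = s.val
    FROM   staging s
    WHERE  prod.id = s.id;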
> I can try this. The problem is that I have to do an insert when the
> update doesn't affect any rows (the rows don't exist yet). The number
> of rows affected by inserts is small compared to the number of updated
> rows (and was 0 when I tested my script). I could do this with a temporary
> table: update all the possible rows, then insert the rows that are in the
> temporary table but not in the production table with a 'not in'
> statement. Is this a correct way?
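(For reference, continuing the hypothetical staging/prod names from the sketch above, the additional 'not in' step after the UPDATE would be roughly:)

    -- Insert only the rows that exist in staging but not yet in production.
    INSERT INTO prod (id, val)
    SELECT s.id, s.val
    FROM   staging s
    WHERE  s.id NOT IN (SELECT id FROM prod);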
That's what I did at first, but later I found better performance with a TRIGGER on the permanent table that deletes the target of an UPDATE, if any, before the UPDATE. That's effectively what PG does anyway under MVCC (an UPDATE writes a new row version and marks the old one dead), and now I can do the entire UPDATE in one command.
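Reading that as a BEFORE INSERT trigger on the permanent table that deletes any pre-existing row with the same key, so the whole refresh can run as one INSERT from the staging table, a minimal sketch might look like the following; the names and the exact trigger timing are my assumptions, not something stated in the thread:

    -- Trigger function: remove the old row, if any, before the new one goes in.
    CREATE OR REPLACE FUNCTION prod_delete_existing() RETURNS trigger AS $$
    BEGIN
        DELETE FROM prod WHERE id = NEW.id;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER prod_replace_row
        BEFORE INSERT ON prod
        FOR EACH ROW EXECUTE PROCEDURE prod_delete_existing();

    -- The whole refresh then becomes a single statement.
    INSERT INTO prod (id, val)
    SELECT id, val FROM staging;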