Quick Links

Re: How to do an UPDATE for all the fields that do NOT break a constraint?

From:	Matthias Karlsson <matthias(at)yacc(dot)se>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject:	Re: How to do an UPDATE for all the fields that do NOT break a constraint?
Date:	2009-01-26 14:45:15
Message-ID:	83eb635f0901260645o27cc44e7weaa1cc94f5ed7d47@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Mon, Jan 26, 2009 at 2:53 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:
> On Mon, Jan 26, 2009 at 9:45 PM, Matthias Karlsson <matthias(at)yacc(dot)se> wrote:
>> On Mon, Jan 26, 2009 at 2:09 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:
>>> I wonder if this is an SQL limitation or something I'm missing in the
>>> PG manual, but I need to run an update on my database (to replace the
>>> value of a column to match a new design structure).
>>>
>>> Due to the new business logic, the replaced value of a field may end
>>> up being already present in the database in another record. This leads
>>> to unique key violations when I run the update.
>>>
>>> My question: I don't mind if the update transaction skips the records
>>> where the key would be violated (this preservation is in fact what we
>>> want) but these are only about 2% of the overall updatable records.
>>>
>>> Is there anyway to make the transaction go through with the remaining
>>> 98% of the update SQL which will in fact NOT violate the unique
>>> constraint?
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>> You could always extend your update statement to include an additional
>> check to see if there are already rows present with the same value in
>> the field you are talking about.
>>
>> // Matthias
>>
>
>
>
> Thanks Matthias, but this seems a little recursive to me and I don't
> know how to do the SQL.
>
> Here is my SQL thus far. The table is "testimonials". Basically the
> column "user_alias" needs to be replaced to delete any mention of a
> user's "api_key". Both of these are fields in the same table, hence
> the replace logic below.
>
> As you will also see, based on our business logic, I have already kept
> all the related IDs in a separate small table called
> "testimonials_temp". This should speed up the process quite a bit
> because instead of going through 5 million IDs, we just loop through
> around 400,000.
>
>
>
> update testimonials
> set user_alias = replace(user_alias, '-'||api_key, '')
> where
> id in (select id from testimonials_temp)
> ;
>
>
>
> The problem is that after being replaced like that the "user_alias"
> column has a problem, because some user_aliases already exist. How
> should I add a check condition recursively? I tried this:
>
>
> update testimonials
> set user_alias = replace(user_alias, '-'||api_key, '')
> where
> id in (select id from testimonials_temp)
> and replace(user_alias, '-'||api_key, '') not in (select user_alias
> from links where user_alias = ?????))
> ;
>
>
>
> Hope I have explained this clearly. Would appreciate any ideas!
>

My idea was very similar to the SQL at the end of your post. Wouldn't
something like this work?

update testimonials u
set u.user_alias = replace(u.user_alias, '-'||api_key, '')
where
u.id in (select id from testimonials_temp) and
not exists (select id testimonials where user_alias =
replace(u.user_alias, '-'||api_key, '')))
?

Not sure if this exact SQL is correct, but in your not in expression,
you just need to make sure to refer to the user_alias of the current
row being updated.

// Matthias

In response to

Re: How to do an UPDATE for all the fields that do NOT break a constraint? at 2009-01-26 13:53:22 from Phoenix Kiula

Browse pgsql-general by date

	From	Date	Subject
Next Message	Tom Lane	2009-01-26 15:36:34	Re: where (x, y, z) in ((x1, y1, z1), (x1, y1, z1), (x1, y1, z1), (x2, y2, z2)) (not) optimized
Previous Message	Grzegorz Jaśkiewicz	2009-01-26 14:12:11	Re: where (x,y,z) in ((x1,y1, z1), (x1,y1, z1), (x1,y1, z1), (x2,y2, z2)) (not) optimized