From: | "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | Håkan Jacobsson <hakan(dot)jacobsson99(at)bredband(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: SQL for Deleting all duplicate entries |
Date: | 2007-09-05 14:55:01 |
Message-ID: | dcc563d10709050755v7ff08941n49d4ed73ae1ac312@mail.gmail.com |
Lists: | pgsql-general |
On 9/5/07, Håkan Jacobsson <hakan(dot)jacobsson99(at)bredband(dot)net> wrote:
> Hi,
>
> I want to create a DELETE statement which deletes duplicates
> in a table.
>
> That is, I want to remove all rows - but one - having three
> columns with the same data (more columns exist and there the
> data varies).
Assuming you've got a KNOWN unique id field (add one if you don't),
you can do something like this, with "table", "field1" etc. standing
in for your own names:
select t1.uid from table t1 join table t2 on (t1.field1=t2.field1 AND
t1.field2=t2.field2 AND t1.field3=t2.field3 AND t1.uid>t2.uid)
That should get the ids of all but one of the matching rows (the row
with the lowest uid in each group is the one left out). Then just use
that in a subselect:
begin;
delete from table where uid in (select t1.uid from table t1 join table t2
on (t1.field1=t2.field1 AND t1.field2=t2.field2 AND
t1.field3=t2.field3 AND t1.uid>t2.uid) );
(check for dups / lost data)
commit;
or something like that.
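
For anyone who wants to try the idea end to end, here is a minimal sketch of the delete-then-check-then-commit sequence described above. It is only an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL, and the table name "items", the "note" column, the sample rows and the helper functions are all made up for the example. The sanity check (one row left per distinct field1/field2/field3 combination) is one possible reading of "check for dups / lost data", not the only one.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table: uid is the known unique id, field1..field3 are the
# columns that define a duplicate, note is a column whose data varies.
conn.execute(
    "CREATE TABLE items ("
    "uid INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT, field3 TEXT, note TEXT)"
)
conn.executemany(
    "INSERT INTO items (field1, field2, field3, note) VALUES (?, ?, ?, ?)",
    [
        ("a", "b", "c", "first"),
        ("a", "b", "c", "second"),
        ("a", "b", "c", "third"),
        ("x", "y", "z", "only"),
    ],
)
conn.commit()

# Self-join: a row is selected when another row with the same three
# fields and a lower uid exists, so the lowest uid per group survives.
DELETE_DUPS = """
DELETE FROM items
WHERE uid IN (
    SELECT t1.uid
    FROM items t1
    JOIN items t2
      ON t1.field1 = t2.field1
     AND t1.field2 = t2.field2
     AND t1.field3 = t2.field3
     AND t1.uid > t2.uid
)
"""


def count_groups(conn):
    """Number of distinct (field1, field2, field3) combinations."""
    sql = "SELECT COUNT(*) FROM (SELECT DISTINCT field1, field2, field3 FROM items)"
    return conn.execute(sql).fetchone()[0]


def delete_duplicates(conn):
    """Delete duplicates, verify the result, then commit or roll back."""
    groups_before = count_groups(conn)
    deleted = conn.execute(DELETE_DUPS).rowcount  # opens a transaction
    rows_after = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    # Expect no group to vanish and exactly one row left per group.
    if count_groups(conn) != groups_before or rows_after != groups_before:
        conn.rollback()
        raise RuntimeError("duplicate check failed, rolled back")
    conn.commit()
    return deleted


deleted = delete_duplicates(conn)
remaining = conn.execute("SELECT uid, note FROM items ORDER BY uid").fetchall()
print(deleted, remaining)
```

One thing to watch: the join compares with "=", so rows where any of the three fields is NULL never match each other and are not treated as duplicates. In this sketch such leftovers would make the check fail and the delete roll back.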