From: | Håkan Jacobsson <hakan(dot)jacobsson99(at)bredband(dot)net> |
---|---|
To: | <mmoncure(at)gmail(dot)com> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: SQL for Deleting all duplicate entries |
Date: | 2007-09-09 12:28:51 |
Message-ID: | 17416430.759551189340931591.JavaMail.root@ps2.bredband.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Merlin,
Its just about three columns - not any column. Two columns are
varchars and the third is
a date. The date column value is NULL for the rows for which
I want to delete the duplicates.
Yes, please, be a bit more specific!
/regards, Håkan Jacobsson
>----Ursprungligt meddelande----
>Från: mmoncure(at)gmail(dot)com
>Datum: 06-09-2007 01:56
>Till: "Håkan Jacobsson"<hakan(dot)jacobsson99(at)bredband(dot)net>
>Kopia: <pgsql-general(at)postgresql(dot)org>
>Ärende: Re: [GENERAL] SQL for Deleting all duplicate entries
>
>On 9/5/07, Håkan Jacobsson <hakan(dot)jacobsson99(at)bredband(dot)net>
wrote:
>> Hi,
>>
>> I want to create a DELETE statement which deletes
duplicates
>> in a table.
>>
>> That is, I want to remove all rows - but one - having three
>> columns with the same data (more columns exist and there
the
>> data varies).
>> For example:
>> column1
>> column2
>> column3
>> column4
>> column5
>>
>> column2 = 'test', column3 = 'hey' and column4 IS NULL for
>> several rows in the table. I want to keep just one of those
>> rows.
>>
>> Is this possible? I can't figure it out, so any help MUCH
>> appreciated!
>
>when removing duplicates, I find it is usually better to look
at this
>problem backwards...you want to select out the data you want
to keep,
>truncate the original table, and insert select the data back
in.
>
>What isn't exactly clear from your question is if you are
interested
>in only particular fields or if you want to throw out based
on any
>columns (nut just 2, 3, and 4). If so, this is a highly
irregular
>(and interesting) problem, and should prove difficult to make
>efficient.
>
>If you are only interested in three particular columns, then
it's easy.
>1. select out data you want to keep using create table
scratch SELECT
>DISTINCT ON or GROUP BY into scratch
>2. truncate main table
>3. insert into main select * from scratch
>
>for a more specific answer, you will have to provide some
more detail,
>especially regarding exactly how you determine two rows as
being
>'duplicates'.
>
>merlin
>
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2007-09-09 13:42:05 | Re: SQL for Deleting all duplicate entries |
Previous Message | Filip Rembiałkowski | 2007-09-09 11:01:52 | Re: work hour calculations |