Quick Links

Re: Duplicate Row Removal

From:	Berend Tober <btober(at)seaworthysys(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Duplicate Row Removal
Date:	2005-11-05 19:09:40
Message-ID:	436D0374.60308@seaworthysys.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgadmin-hackers pgsql-general

Dean Gibson (DB Administrator) wrote:

> CREATE TABLE new_name AS SELECT DISTINCT * FROM old_name;
>
> DROP TABLE old_name;
>
> ALTER TABLE new_name RENAME TO old_name;

The problem with this technique is that it doesn't account for indexes,
foreign key references, and other dependencies.

Another approach is to temporarily add an integer column, populate it
with sequential values, and then use that new column to uniquely
identify the rows that are otherwise duplicates. Then you can use
aggregation to identify and delete the rows you don't need, followed by
dropping the temporary extra column. HTH.

-- BMT

>
> On 2005-11-04 17:15, Peter Atkins wrote:
>
>> All,
>>
>> I have a duplicate row problem and to make matters worse some tables
>> don't have a PK or any unique identifier.
>> Anyone have any thoughts on how to remove dups?
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

In response to

Re: Duplicate Row Removal at 2005-11-05 06:34:39 from Dean Gibson (DB Administrator)

Responses

Re: [GENERAL] Duplicate Row Removal at 2005-11-05 20:45:41 from Florian G. Pflug

Browse pgadmin-hackers by date

	From	Date	Subject
Next Message	Florian G. Pflug	2005-11-05 20:28:19	Re: [GENERAL] Missing variable "role" in "pg_settings"?
Previous Message	Tom Lane	2005-11-05 19:08:49	Re: Missing variable "role" in "pg_settings"?

Browse pgsql-general by date

	From	Date	Subject
Next Message	John Sidney-Woollett	2005-11-05 19:31:14	Re: joining a query with a select count(*)
Previous Message	Tom Lane	2005-11-05 19:08:49	Re: Missing variable "role" in "pg_settings"?