Re: Duplicates Processing

From: Tim Landscheidt <tim(at)tim-landscheidt(dot)de>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Duplicates Processing
Date: 2010-10-08 18:42:25
Message-ID: m3iq1cr0fi.fsf@passepartout.tim-landscheidt.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Gary Chambers <gwchamb(at)gmail(dot)com> wrote:

> I've been provided a CSV file of parts that contains duplicates of
> properties (e.g. resistors have a wattage, tolerance, and temperature
> coefficient property) of those parts that differ by a manufacturer
> part number. What I'd like to do is to process this file and, upon
> encountering one of the duplicates, take that part with its new part
> number and move it to a part substitutes table. It seems like it
> should be pretty simple, but I can't seem to generate a query or a
> function to accomplish it. I'd greatly appreciate any insight or
> assistance with solving this problem. Thank you very much in advance.

You can - for example - create a query with a call to
ROW_NUMBER() and then process the matching rows (untested):

| INSERT INTO substitutes ([...])
| SELECT [...] FROM
| (SELECT *,
| ROW_NUMBER() OVER (PARTITION BY wattage, tolerance, temperature
| ORDER BY part_number) AS RN
| FROM parts) AS SubQuery
| WHERE RN > 1;

| DELETE FROM parts
| WHERE primary_key IN
| (SELECT primary_key FROM
| (SELECT *,
| ROW_NUMBER() OVER (PARTITION BY wattage, tolerance, temperature
| ORDER BY part_number) AS RN
| FROM parts) AS SubQuery
| WHERE RN > 1);

Tim

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Gary Chambers 2010-10-08 19:42:24 Re: Duplicates Processing
Previous Message Gary Chambers 2010-10-08 15:12:10 Duplicates Processing