Re: [GENERAL] INSERT. RETURNING for copying records

From: Michael Sacket <msacket(at)gammastream(dot)com>
To: "David Johnston" <polobo(at)yahoo(dot)com>
Cc: "'PG-General Mailing List'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [GENERAL] INSERT. RETURNING for copying records
Date: 2012-09-10 15:03:53
Message-ID: 547DD5A9-07C2-4F72-9D3D-FED732EDB739@gammastream.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Sep 7, 2012, at 2:19 PM, David Johnston wrote:

>
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Michael Sacket
> Sent: Friday, September 07, 2012 2:09 PM
> To: PG-General Mailing List
> Subject: [GENERAL] INSERT… RETURNING for copying records
>
> Good Afternoon,
>
> I'm attempting to write a function that will duplicate a few records, but the catch is I need to have a mapping of the original pk to the new pk. I know I can use the RETURNING clause to get the new ids... but how to map that to the original ones is escaping me.
>
> < Setup >
>
> CREATE TABLE testing (rid serial PRIMARY KEY, category text NOT NULL, name text NOT NULL, fk_parent int4);
>
> INSERT INTO testing (category, name, fk_parent) VALUES ('cat1', 'one', NULL), ('cat1', 'one.one', 1), ('cat1', 'one.two', 1);
>
> SELECT * FROM testing;
> +-----+----------+---------+-----------+
> | rid | category | name | fk_parent |
> +-----+----------+---------+-----------+
> | 1 | cat1 | one | NULL |
> | 2 | cat1 | one.one | 1 |
> | 3 | cat1 | one.two | 1 |
> +-----+----------+---------+-----------+
>
> < Duplicating the records >
>
> INSERT INTO testing (category, name, fk_parent) (select category, name, fk_parent from testing where category='cat1') returning rid, category, name, fk_parent;
> +-----+----------+---------+-----------+
> | rid | category | name | fk_parent |
> +-----+----------+---------+-----------+
> | 4 | cat1 | one | NULL |
> | 5 | cat1 | one.one | 1 |
> | 6 | cat1 | one.two | 1 |
> +-----+----------+---------+-----------+
>
> < What I'm looking for >
> +--------------+-----+
> | original_rid | rid |
> +--------------+-----+
> | 1 | 4 |
> | 2 | 5 |
> | 3 | 6 |
> +--------------+-----+
>
> < This doesn't work >
>
> INSERT INTO testing (category, name, fk_parent) select category, name, fk_parent from testing as original where category='cat1' returning rid, category, name, fk_parent, original.rid;
>
>
> Specifically, my goal is to be able to duplicate a subset of records and map any referenced foreign keys to the new ones from the copies. I could write a pl/pgsql function to loop through the records and build the mapping as I go, but I was thinking there might be a better way. Any thoughts?
>
> Thanks!
> Michael
>
>
>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>
> Two thoughts (syntax not validated):
>
> INSERT INTO …. VALUES (non-id-cols, id)
> SELECT [non-id-cols], nextval(‘sequence’) AS new_id FROM testing
> RETURNING id, new_id
>
> There is no reason to delay the assignment of the ID until the time of insert; by polling the sequence manually you get the same effect but at a time when you have not forgotten what the old value was.

I gave that a try; however, it seems that columns from the SELECT statement are not available for use in the RETURNING clause.

>
> If for some reason you have to let the ID be auto-generated you likely need to identify the “natural key” for the record and then:
>
> WITH ins (
> INSERT …. RETURNING newid, (natural_key_cols) AS natrualkey
> )
> SELECT *
> FROM ins
> JOIN testing ON
> ins.naturalkey = (testing.natural_key cols)
>
> If there is no natural key then this method is ambiguous in the presence of multiple otherwise identical records.

I tried something along those lines using row_number(). I think perhaps it would, as you suggested, be better to poll the sequence.

WITH x as (
SELECT row_number() over (order by rid asc) as rownum, rid, category, name, fk_parent FROM testing WHERE category='cat1'
),
y as (
INSERT INTO test (name, fk_parent) select 'cat1-copy', name, fk_parent FROM x returning rid
),
z as (
SELECT row_number() over (order by rid asc) as rownum, rid FROM y
)
SELECT x.rownum, z.rid as new_rid, x.rid as org_rid FROM z, x WHERE z.rownum=x.rownum;

Ultimately, I think doing a loop using pl/pgsql isn't so bad considering the number of records generally copied is small.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Craig Gibson 2012-09-10 16:34:29 Performance issue with cross table updates
Previous Message Amit Kapila 2012-09-10 14:57:37 Behavior difference for walsender and walreceiver for n/w breakdown case