Re: Danger of idiomatic plpgsql loop for merging data

From: "J(dot) Greg Davidson" <jgd(at)well(dot)com>
To: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, lynn <lynn(at)creditlink(dot)com>, /Blank Page/ <blankpage2008(at)gmail(dot)com>
Subject: Re: Danger of idiomatic plpgsql loop for merging data
Date: 2010-08-05 00:14:16
Message-ID: 942214912.1858.1280967256804.JavaMail.root@zimbra.well.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 2010-08-04 at 11:12 -0400, Merlin Moncure wrote:

> The infinite loop check is good but you missed the most important
> part: you need to be checking sqlerrm to see where the unique
> violation is coming from. Your original issue was that some dependent
> trigger was causing the error which is getting caught here. Your
> check should ONLY be handling unique violations on the table 'foos'.
>
> The error message (sqlerrm) will look something like this:
> ERROR: duplicate key value violates unique constraint "a_constraint_pkey"
>
> I would do something like this:
> WHEN unique_violation THEN -- another TABLE?
> this_table := false;
>
> IF SQLERRM ~ 'unique constraint "a_constraint_pkey"' THEN
> this_table := true;
> END IF;
>
> IF SQLERRM ~ 'unique constraint "another_unique_constraint"' THEN
> this_table := true;
> END IF;
>
> IF NOT this_table
> RAISE '%', SQLERRM USING ERRCODE = 'unique_violation';
> END IF;
>
> yes, this is awful. hopefully your constraints have useful names that
> are unique. IMNSHO the fully schema qualified table name should be in
> the error message.
>
> merlin

My infinite loop check is probably paranoia if I put in the check
you suggest. The check you suggest is absolutely correct, yet it
cannot be coded portably. The unique constraints have whatever name
PostgreSQL generates in response to the PRIMARY KEY or UNIQUE keywords.
I have to deal with 48 different tables in the current codebase, so
both maintenance and boilerplate reduction are important. This leads
me to suggest the following new idiom for this kind of function,
starting with two necessary utility functions:

-- definitions needed here moved to bottom of message

CREATE OR REPLACE
FUNCTION errm_is_from_table(text, regclass) RETURNS boolean AS $$
-- Warning: Non-portable implementation!
-- Based on current PostgreSQL SQLERRM strings like:
-- duplicate key value violates unique constraint ...
-- ... "foos_pkey"
SELECT $1 LIKE '%"' || $2 || '_%"%'
$$ LANGUAGE sql;

CREATE OR REPLACE
FUNCTION errm_not_from_table(
text, regclass, regprocedure, VARIADIC text[] = '{}'::TEXT[]
) RETURNS boolean AS $$
BEGIN
IF NOT errm_is_from_table($1, $2) THEN
RETURN true;
END IF;
RAISE NOTICE '% raised: %',
$3::regproc || '(' || array_to_string($4, ',') || '): ', $1;
RETURN false;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE
FUNCTION get_foo(text) RETURNS foo_reftype AS $$
DECLARE
_ref foo_reftype;
_table regclass := 'foos';
this regprocedure := 'get_foo(text)';
BEGIN
LOOP
SELECT ref_ INTO _ref FROM foos WHERE name_ = $1;
IF FOUND THEN RETURN _foo; END IF;
BEGIN
INSERT INTO foos(name_) VALUES($1);
EXCEPTION
WHEN unique_violation THEN
IF errm_not_from_table(ERRM, _table, _this, $1) THEN
RAISE; -- re-raises original exception
END IF;
END;
END LOOP;
END;
$$ LANGUAGE plpgsql;

If I could move the re-raising into errm_not_from_table()
then I could make things even cleaner, but I don't know
if that's possible.

Here are the omitted definitions needed to make this
simplified example code work:

CREATE DOMAIN foo_reftype AS INTEGER;

CREATE TABLE foos (
ref_ foo_reftype PRIMARY KEY,
name_ text UNIQUE NOT NULL
);

CREATE SEQUENCE foos__seq OWNED BY foos.ref_;

CREATE FUNCTION next_foo_ref() RETURNS foo_reftype AS $$
SELECT nextval('foos__seq')::foo_reftype
$$ LANGUAGE sql;

ALTER TABLE foos ALTER COLUMN ref_ SET DEFAULT next_foo_ref();

_Greg

J. Greg Davidson

Browse pgsql-general by date

  From Date Subject
Next Message andi astowo 2010-08-05 01:54:22
Previous Message Nick 2010-08-04 22:03:39 Re: On insert duplicate row, return existing key