Re: Normalizing Unnormalized Input

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Stephen Froehlich <s(dot)froehlich(at)cablelabs(dot)com>
Cc: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Normalizing Unnormalized Input
Date: 2017-06-20 23:10:46
Message-ID: CAKFQuwbEMiORC8cAm3AmvQGdSYG9usBA541DsDC1zKN1JvV-Ww@mail.gmail.com
Lists: pgsql-novice

On Tue, Jun 20, 2017 at 3:50 PM, Stephen Froehlich
<s.froehlich@cablelabs.com> wrote:
> The part of the problem that I haven’t solved conceptually yet is how to
> normalize the incoming data.

The specifics of the data matter, but if at all possible I do something like:

BEGIN;
CREATE TEMP TABLE tt (LIKE t);
COPY tt FROM STDIN;
-- insert new records into t from tt (one statement per target table)
INSERT INTO t SELECT ... FROM tt WHERE NOT EXISTS (...);
-- update existing records in t using tt (one statement per target table)
UPDATE t SET ... FROM tt WHERE ...;
END;
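
To make that concrete, here is a minimal sketch against a hypothetical
target table users(id, name) keyed on id (the table and column names
are mine, purely for illustration):

BEGIN;
CREATE TEMP TABLE tt (LIKE users);
COPY tt FROM STDIN;
-- new records: rows in tt whose key is not yet in users
INSERT INTO users (id, name)
SELECT s.id, s.name
FROM tt AS s
WHERE NOT EXISTS (SELECT 1 FROM users AS u WHERE u.id = s.id);
-- existing records: refresh the non-key columns from the batch
UPDATE users AS u
SET name = s.name
FROM tt AS s
WHERE u.id = s.id;
END;

Each target table gets exactly two set-based statements no matter how
many rows arrive in the batch.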

I don't get why (or how) you'd "rename the table into a temp table"...

It's nice that we've added upsert, but it seems more useful for
streaming than for batch. At scale you should try to avoid collisions
in the first place.
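
For reference, the upsert form would be something like this, again
using the hypothetical users table (ON CONFLICT needs a primary key or
unique constraint on id to target):

INSERT INTO users (id, name)
SELECT s.id, s.name FROM tt AS s
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name;

One statement instead of two, but every conflicting row pays the
conflict-handling cost, which is the collision problem mentioned above.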

Temporary table names only need to be unique within the session.

The need for indexes on the temporary table is usually limited since
the goal is to move large subsets of it around all at once.

David J.
