Re: Import large data set into a table and resolve duplicates?

From:	Eugene Dzhurinsky <jdevelop(at)gmail(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Import large data set into a table and resolve duplicates?
Date:	2015-02-15 17:36:44
Message-ID:	20150215173644.GB4901@devbox
Lists:	pgsql-general

On Sun, Feb 15, 2015 at 10:00:50AM -0600, John McKown wrote:
> UPDATE patch_data SET already_exists = EXISTS (SELECT 1 FROM dictionary WHERE
> dictionary.series = patch_data.series);

Since "dictionary" already has an index on "series", it seems that
"patch_data" doesn't need any index for this step.

> At this point, the table patch_data has been updated such that if the
> series data in it already exists, the "already_exists" column is now TRUE
> instead of the initial FALSE. This means that we need to insert all the
> series data in "patch_data" which does not exist in "dictionary" ( i.e.
> "already_exists" is FALSE in "patch_data") into "dictionary".
>
> INSERT INTO dictionary(series) SELECT series FROM patch_data WHERE
> already_exists = FALSE;

At this point "patch_data" needs an index covering the rows where
"already_exists = false" (for example a partial index), which seems to be
cheap to build.
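
A minimal sketch of that step, assuming a partial index is what is meant. It
runs against an in-memory SQLite database purely as a stand-in for PostgreSQL,
so booleans are stored as 0/1; the index name "patch_data_new_idx", the column
types and the sample rows are all made up for illustration. The flag is set
with EXISTS so that rows without a match stay at 0 instead of becoming NULL.

```python
import sqlite3

# SQLite is only a stand-in for PostgreSQL here; booleans are stored as 0/1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dictionary (id INTEGER PRIMARY KEY, series TEXT UNIQUE);
    CREATE TABLE patch_data (
        series TEXT,
        already_exists INTEGER DEFAULT 0,
        id INTEGER
    );
    INSERT INTO dictionary (series) VALUES ('aaa'), ('bbb');
    INSERT INTO patch_data (series) VALUES ('bbb'), ('ccc'), ('ddd');
""")

# Flag the rows whose series is already known.
# EXISTS always yields 0 or 1, never NULL.
conn.execute("""
    UPDATE patch_data
       SET already_exists = EXISTS (
           SELECT 1 FROM dictionary
            WHERE dictionary.series = patch_data.series)
""")

# Partial index (hypothetical name): only rows still flagged as new are indexed.
conn.execute("""
    CREATE INDEX patch_data_new_idx ON patch_data (series)
     WHERE already_exists = 0
""")

# Copy only the new series into the dictionary.
conn.execute("""
    INSERT INTO dictionary (series)
    SELECT series FROM patch_data WHERE already_exists = 0
""")

flags = dict(conn.execute("SELECT series, already_exists FROM patch_data"))
new_series = [row[0] for row in conn.execute(
    "SELECT series FROM dictionary ORDER BY series")]
print(flags)
print(new_series)
```

In PostgreSQL the index predicate would be written as "WHERE already_exists =
false" (or "WHERE NOT already_exists"); whether the planner actually uses the
index depends on how many rows are new.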

> UPDATE patch_data SET id=((SELECT id FROM dictionary WHERE
> dictionary.series = patch_data.series));

No index needed here except the existing one on "dictionary".
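
The id back-fill can be sketched the same way, again with in-memory SQLite
standing in for PostgreSQL and with made-up sample rows. The only index
involved is the one that the UNIQUE constraint on "dictionary.series" creates;
"patch_data" has none in this sketch.

```python
import sqlite3

# SQLite is only a stand-in for PostgreSQL; the rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- UNIQUE creates the index on dictionary.series that the lookup relies on.
    CREATE TABLE dictionary (id INTEGER PRIMARY KEY, series TEXT UNIQUE);
    CREATE TABLE patch_data (series TEXT, id INTEGER);
    INSERT INTO dictionary (id, series)
    VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
    INSERT INTO patch_data (series) VALUES ('bbb'), ('ccc');
""")

# For each patch_data row, look up the matching dictionary id by series.
conn.execute("""
    UPDATE patch_data
       SET id = (SELECT dictionary.id FROM dictionary
                  WHERE dictionary.series = patch_data.series)
""")

resolved = dict(conn.execute("SELECT series, id FROM patch_data"))
print(resolved)
```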

That looks really promising, thank you John! I need only one index on the
"patch_data" table, and I will re-use the existing index on the "dictionary".

Thanks again!

--
Eugene Dzhurinsky
