Re: Import large data set into a table and resolve duplicates?

From: Eugene Dzhurinsky <jdevelop(at)gmail(dot)com>
To: Francisco Olarte <folarte(at)peoplecall(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Import large data set into a table and resolve duplicates?
Date: 2015-02-15 17:29:10
Message-ID: 20150215172910.GA4901@devbox
Lists: pgsql-general

On Sun, Feb 15, 2015 at 01:06:02PM +0100, Francisco Olarte wrote:
> You state below 200k rows, 50k lines per path. That is not huge unless
> "series" is really big, is it?

The series data is between 100 and 4096 chars.

> 1.- Get the patches into a ( temp ) table, using something like \copy, call
> this patches_in.
> 2.- create (temp) table existing_out as select series, id from dictionary
> join patches_in on (series);
> 3.- delete from patches_in where series in (select series from
> existing_out);
> 4.- create (temp) table new_out as insert into dictionary (series) select
> patches_in.series from patches_in returning series, id
> 5.- Copy existing out and patches out.
> 6.- Cleanup temps.
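
For what it's worth, a minimal sketch of those steps as I read them, assuming
the dictionary is roughly (id bigserial primary key, series text unique) and
using made-up file names:

    -- 1. Load the incoming batch into a temp table via \copy.
    CREATE TEMP TABLE patches_in (series text);
    \copy patches_in (series) FROM 'batch.txt'

    -- 2. Collect the series that already exist in the dictionary.
    CREATE TEMP TABLE existing_out AS
        SELECT d.series, d.id
        FROM dictionary d
        JOIN patches_in p USING (series);

    -- 3. Drop the already-known series from the incoming batch.
    DELETE FROM patches_in
    WHERE series IN (SELECT series FROM existing_out);

    -- 4. Insert the remaining (new) series, capturing the generated ids.
    --    (CREATE TABLE AS cannot wrap an INSERT, so a writable CTE is used.)
    CREATE TEMP TABLE new_out (series text, id bigint);
    WITH ins AS (
        INSERT INTO dictionary (series)
        SELECT series FROM patches_in
        RETURNING series, id
    )
    INSERT INTO new_out SELECT series, id FROM ins;

    -- 5. Export both result sets.
    \copy existing_out TO 'existing.txt'
    \copy new_out TO 'new.txt'

    -- 6. Temp tables disappear at the end of the session.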

That sounds cool, but I'm a bit worried about the performance of lookups on the
series column, and about the time it takes to build an index on "series" for
the "temp" table. But perhaps it's better to try this first, and if performance
turns out to be really bad, then do some optimizations like partitioning etc.
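
For instance, something like the following (assuming dictionary.series already
has a unique index) would show whether the join in step 2 is acceptable before
bothering with anything fancier:

    ANALYZE patches_in;
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT d.series, d.id
    FROM dictionary d
    JOIN patches_in p USING (series);

    -- An explicit index on patches_in(series) may not even be needed:
    -- for ~50k rows the planner can hash-join against the dictionary.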

Thank you!

--
Eugene Dzhurinsky
