From: Marc Eberhard <eberhardma(at)googlemail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Duplicate deletion optimizations
Date: 2012-01-06 20:22:46
Message-ID: CAPaGL57q3h+Z7d=aRSzPrx4Uu0A9w8+eMZ+p=B1FkHKFe22KoA@mail.gmail.com
Lists: pgsql-performance
Hi Samuel!
On 6 January 2012 20:02, Samuel Gendler <sgendler(at)ideasculptor(dot)com> wrote:
> Have you considered doing the insert by doing a bulk insert into a temp
> table and then pulling rows that don't exist across to the final table in
> one query and updating rows that do exist in another query? I did a very
> brief scan of the SO thread and didn't see it suggested. Something like
> this:
>
> update stats_5mn set count = stats_5mn.count + t.count
> from temp_table t
> where stats_5mn.t_value = t.t_value and stats_5mn.t_record = t.t_record
> and stats_5mn.output_id = t.output_id;
>
> insert into stats_5mn
> select * from temp_table t
> where not exists (
> select 1 from stats_5mn s
> where s.t_value = t.t_value and s.t_record = t.t_record and s.output_id =
> t.output_id
> );
>
> drop table temp_table;
Am I right to assume that the update and insert need to be wrapped in a
begin / end transaction block if such batch uploads might happen
concurrently? That doesn't seem to be the case for the question here,
but I like the solution and wonder whether it works under more general
circumstances.
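
To make sure I understand the suggestion correctly, here is roughly how
I picture the whole round trip wrapped in a single transaction (the
temp_table layout copied from stats_5mn, the "on commit drop" and the
copy load step are my guesses at the details you left out):

begin;

-- temp_table mirrors stats_5mn and disappears automatically at commit
create temporary table temp_table (like stats_5mn including defaults)
on commit drop;

-- bulk load the new batch from the client (copy, or insert ... select)
copy temp_table from stdin;

-- add the counts of rows that already exist ...
update stats_5mn set count = stats_5mn.count + t.count
from temp_table t
where stats_5mn.t_value = t.t_value and stats_5mn.t_record = t.t_record
and stats_5mn.output_id = t.output_id;

-- ... and insert the ones that don't exist yet
insert into stats_5mn
select * from temp_table t
where not exists (
select 1 from stats_5mn s
where s.t_value = t.t_value and s.t_record = t.t_record
and s.output_id = t.output_id
);

commit;

-- note: two concurrent uploads could still both pass the "not exists"
-- check for the same key before either commits, which is exactly the
-- part I'm unsure about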
What's the overhead of creating and dropping a temporary table? Is it
only worth doing this for a large number of inserted/updated rows? What
if the number of inserts/updates is only a dozen at a time against a
large table (>10M rows)?
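
For a dozen rows I wonder whether a plain values list could stand in for
the temporary table altogether, something along these lines (the numbers
are just made-up placeholders, and I'm assuming the four columns from
your queries are all there is to stats_5mn):

update stats_5mn set count = stats_5mn.count + t.count
from (values (100, 200, 1, 5),
             (100, 201, 1, 3)) as t(t_value, t_record, output_id, count)
where stats_5mn.t_value = t.t_value and stats_5mn.t_record = t.t_record
and stats_5mn.output_id = t.output_id;

insert into stats_5mn (t_value, t_record, output_id, count)
select t.t_value, t.t_record, t.output_id, t.count
from (values (100, 200, 1, 5),
             (100, 201, 1, 3)) as t(t_value, t_record, output_id, count)
where not exists (
select 1 from stats_5mn s
where s.t_value = t.t_value and s.t_record = t.t_record
and s.output_id = t.output_id
);

That avoids the temporary table, but I have no feeling for whether the
planner handles a values list against a >10M row table as well as it
handles the temp table variant.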
Thanks,
Marc