Re: Most efficient way to insert without duplicates

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: François Beausoleil <francois(at)teksol(dot)info>
Cc: Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject: Re: Most efficient way to insert without duplicates
Date: 2013-04-17 18:15:36
Message-ID: CAMkU=1xkN0zRNoU4o4SwgFTEjc6fQqA3-X7Z=FyHdB-gnRdMwg@mail.gmail.com
Lists: pgsql-general

On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil
<francois(at)teksol(dot)info>wrote:

> Insert on public.persona_followers (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1)
>   Buffers: shared hit=33135295 read=4776921
>   ->  Subquery Scan on t1 (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=562265.156..578844.999 rows=6819520 loops=1)
>

It looks like 12% of the time is being spent figuring out what rows to
insert, and 88% actually doing the insertions.

So I think that index maintenance is killing you. You could try adding a
sort to your select so that rows are inserted in index order, or inserting
in batches in which the batches are partitioned by service_id (which is
almost the same thing as sorting, since service_id is the leading column).
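The sorted-insert idea could be sketched roughly as below. The table and
column names (persona_followers, service_id, follower_id) and the t1 source
are assumptions inferred from the plan quoted above, not the poster's actual
schema:

```sql
-- Hypothetical sketch: sort the SELECT so rows arrive in index order,
-- turning random index-page touches into mostly sequential ones.
INSERT INTO persona_followers (service_id, follower_id)
SELECT service_id, follower_id
FROM t1
ORDER BY service_id, follower_id;

-- Batched variant: one INSERT per service_id, which groups index
-- maintenance by the index's leading column much like the sort does.
INSERT INTO persona_followers (service_id, follower_id)
SELECT service_id, follower_id
FROM t1
WHERE service_id = 1;  -- repeat for each service_id value
```

Either way, the goal is the same: keep consecutive inserts landing near each
other in the index so fewer index pages are read and dirtied.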

Cheers,

Jeff
