Re: Most efficient way to insert without duplicates

From: Moshe Jacobson <moshe(at)neadwerx(dot)com>
To: François Beausoleil <francois(at)teksol(dot)info>
Cc: Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject: Re: Most efficient way to insert without duplicates
Date: 2013-04-17 18:44:21
Message-ID: CAJ4CxLmFHQmpavxj+LwpodeL4tAMwT=D6DG5yA=nxW0wx7y6pA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Apr 16, 2013 at 10:51 PM, François Beausoleil
<francois(at)teksol(dot)info>wrote:

> INSERT INTO persona_followers(service_id, follower_id, valid_at)
> SELECT service_id, follower_id, NOW()
> FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import
> WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id
> = persona_followers.service_id AND import.follower_id =
> persona_followers.follower_id);
>

Try this for your insert query instead:

insert into persona_followers( service_id, follower_id, valid_at )
select i.service_id, i.follower_id, now()
from import i
left join persona_followers pf on i.service_id = pf.service_id and
i.follower_id = pf.follower_id
where pf.service_id is null

order by 1,2;

This will insert only those rows that are not already present, and involves
no subqueries and only one join.

--
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe(at)neadwerx(dot)com | www.neadwerx.com

"Quality is not an act, it is a habit." -- Aristotle

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alfonso Afonso 2013-04-17 18:48:42 Re: SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
Previous Message Yuriy Rusinov 2013-04-17 18:33:50 Re: GSL onto postgresql server 9.2