Re: how to make duplicate finding query faster?

From: Sachin Kumar <sachinkumaras(at)gmail(dot)com>
To: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, krishna(at)thewebconz(dot)com, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: how to make duplicate finding query faster?
Date: 2020-12-30 13:24:14
Message-ID: CALg-PKB92uV1v_R2JTLsr27xUdJaEru-b=Frig_YC=jy9L3X6A@mail.gmail.com
Lists: pgsql-admin

Hi Scott,

Yes, I am checking one by one because my goal is to fail the whole upload
if there is any duplicate entry and to inform the user that the file
contains a duplicate.
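Since the goal is to reject the whole file on any duplicate, the per-row `.exists()` loop can be replaced by a single set-based query: collect the card numbers from the CSV and ask the database once which of them already exist. A minimal sketch of that idea (sqlite3 stands in for PostgreSQL so the example is self-contained; the table and column names are assumptions based on the thread, and a real 600k-row file would need the IN list chunked or loaded via a staging table):

```python
import csv
import io
import sqlite3

# Stand-in for the real card_bank table (PostgreSQL in the thread);
# sqlite3 is used here only so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE card_bank (account_number TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO card_bank VALUES (?)", [("1001",), ("1002",)])

# Pretend this is the uploaded CSV.
csv_data = io.StringIO("account_number\n1002\n2001\n")
numbers = [row["account_number"] for row in csv.DictReader(csv_data)]

# One query instead of one .exists() call per row: which uploaded
# numbers already exist in the table?
placeholders = ",".join("?" * len(numbers))
dupes = [r[0] for r in conn.execute(
    f"SELECT account_number FROM card_bank "
    f"WHERE account_number IN ({placeholders})", numbers)]

if dupes:
    # Fail the whole upload and report the offending numbers.
    print(f"upload rejected, duplicates: {dupes}")
```

The same shape works through Django's ORM as a single `filter(ACCOUNT_NUMBER__in=...)` over the uploaded numbers, which is one round trip to the database instead of 600k.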

Regards
Sachin

On Wed, Dec 30, 2020 at 6:43 PM Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
wrote:

> > On Dec 30, 2020, at 12:36 AM, Sachin Kumar <sachinkumaras(at)gmail(dot)com>
> wrote:
> >
> > Hi All,
> >
> > I am uploading data into PostgreSQL from a CSV file, and if there is
> > any duplicate value in the DB it should return a duplicate error. I am
> > using the query below.
> >
> > if Card_Bank.objects.filter(Q(ACCOUNT_NUMBER=card_number)).exists():
> >     flag = 2
> > else:
> >     flag = 1
> > It is taking too much time; the CSV has 600k cards.
> >
> > Kindly help me in making the query faster.
> >
> > I am using Python, Django & PostgreSQL.
> > --
> >
> > Best Regards,
> > Sachin Kumar
>
> Are you checking one-by-one because your goal is not to fail the whole
> upload that contains the duplicates, but rather to skip only the duplicates?
>
> If that's the case, I think you'd be better off copying the CSV straight
> into a temp table, using a join to delete duplicates from it, then insert
> the remainder into the target table, and finally drop the temp table.
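The temp-table flow Scott describes could be sketched like this (sqlite3 stands in for PostgreSQL so the example is self-contained; the `staging` and `card_bank` names are assumptions based on the thread, and in PostgreSQL step 1 would be a single `COPY staging FROM ...`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE card_bank (account_number TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO card_bank VALUES (?)", [("1001",), ("1002",)])

# 1. Load the CSV rows into a temp table
#    (PostgreSQL would use COPY here, which is very fast).
conn.execute("CREATE TEMP TABLE staging (account_number TEXT)")
conn.executemany("INSERT INTO staging VALUES (?)",
                 [("1002",), ("2001",), ("2002",)])

# 2. Insert only the rows whose number is not already in the target,
#    skipping the duplicates in one set-based statement.
conn.execute("""
    INSERT INTO card_bank
    SELECT s.account_number FROM staging s
    WHERE NOT EXISTS (SELECT 1 FROM card_bank c
                      WHERE c.account_number = s.account_number)
""")

# 3. Drop the temp table.
conn.execute("DROP TABLE staging")

total = conn.execute("SELECT COUNT(*) FROM card_bank").fetchone()[0]
print(total)  # 1001, 1002 plus the two new rows -> 4
```

In PostgreSQL the anti-join insert can also be written as `INSERT ... SELECT ... ON CONFLICT (account_number) DO NOTHING`, which relies on the unique constraint instead of an explicit `NOT EXISTS`.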

--

Best Regards,
Sachin Kumar
