Re: how to make duplicate finding query faster?

From: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
To: Sachin Kumar <sachinkumaras(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, krishna(at)thewebconz(dot)com, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: how to make duplicate finding query faster?
Date: 2020-12-30 13:13:07
Message-ID: FEEE2DC4-B506-4B34-80FE-07FCC0ADC61E@elevated-dev.com
Lists: pgsql-admin

> On Dec 30, 2020, at 12:36 AM, Sachin Kumar <sachinkumaras(at)gmail(dot)com> wrote:
>
> Hi All,
>
> I am uploading data into PostgreSQL from a CSV file and checking whether any duplicate value already exists in the DB; if so, it should return a duplicate error. I am using the query mentioned below.
>
> if Card_Bank.objects.filter(Q(ACCOUNT_NUMBER=card_number)).exists():
>     flag = 2
> else:
>     flag = 1
> It is taking too much time; the CSV contains 600k cards.
>
> Kindly help me in making the query faster.
>
> I am using Python, Django & PostgreSQL.
> --
>
> Best Regards,
> Sachin Kumar

Are you checking one by one because your goal is not to fail the whole upload when it contains duplicates, but rather to skip only the duplicates?

If that's the case, I think you'd be better off copying the CSV straight into a temp table, deleting the duplicates from it with a join against the target table, inserting the remainder into the target table, and finally dropping the temp table.
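
A minimal sketch of that approach, assuming a target table card_bank with an account_number column and a one-column CSV (all names here are hypothetical; psql's \copy is used since the file presumably lives on the client):

    -- Stage the CSV in a temp table.
    CREATE TEMP TABLE staging (account_number text);
    \copy staging FROM 'cards.csv' WITH (FORMAT csv)

    -- Delete rows that already exist in the target table.
    DELETE FROM staging s
    USING card_bank t
    WHERE s.account_number = t.account_number;

    -- Insert what's left; DISTINCT guards against duplicates within the file itself.
    INSERT INTO card_bank (account_number)
    SELECT DISTINCT account_number FROM staging;

    DROP TABLE staging;

If account_number has a unique index, INSERT ... ON CONFLICT DO NOTHING on the final insert would accomplish the same thing without the explicit DELETE.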
