From: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
To: Sachin Kumar <sachinkumaras(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, krishna(at)thewebconz(dot)com, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: how to make duplicate finding query faster?
Date: 2020-12-30 13:13:07
Message-ID: FEEE2DC4-B506-4B34-80FE-07FCC0ADC61E@elevated-dev.com
Lists: pgsql-admin
> On Dec 30, 2020, at 12:36 AM, Sachin Kumar <sachinkumaras(at)gmail(dot)com> wrote:
>
> Hi All,
>
> I am uploading data into PostgreSQL from a CSV file, and if any value is already a duplicate in the DB, it should return a duplicate error. I am using the query mentioned below.
>
> if Card_Bank.objects.filter(Q(ACCOUNT_NUMBER=card_number)).exists():
>     flag = 2
> else:
>     flag = 1
>
> It is taking too much time; the CSV contains 600k cards.
>
> Kindly help me in making the query faster.
>
> I am using Python, Django & PostgreSQL.
> --
>
> Best Regards,
> Sachin Kumar
Are you checking one-by-one because your goal is not to fail the whole upload that contains the duplicates, but rather to skip only the duplicates?
If that's the case, I think you'd be better off copying the CSV straight into a temp table, deleting the duplicates from it with a join against the target table, then inserting the remainder into the target table, and finally dropping the temp table.
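A minimal sketch of that approach in SQL, assuming the Django model maps to a table `card_bank` with an `account_number` column (names inferred from the quoted code; the CSV path and column layout are placeholders):

```sql
BEGIN;

-- Stage the CSV rows; the temp table inherits the target's column layout.
CREATE TEMP TABLE staging (LIKE card_bank INCLUDING DEFAULTS);

-- Server-side COPY needs filesystem access on the server;
-- from psql you would use \copy instead to read a client-side file.
COPY staging FROM '/path/to/cards.csv' WITH (FORMAT csv, HEADER true);

-- Drop rows whose account number already exists in the target.
DELETE FROM staging s
USING card_bank c
WHERE s.account_number = c.account_number;

-- Insert whatever survived, then clean up.
INSERT INTO card_bank SELECT * FROM staging;
DROP TABLE staging;

COMMIT;
```

This replaces 600k round-trip `exists()` checks with one bulk load and two set-based statements; if the CSV itself may contain internal duplicates, you would also dedupe within `staging` (e.g. with `DISTINCT ON (account_number)`) before the final insert.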