Multiple occurrence

From: Nipuna <nipunajoset(at)gmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Multiple occurrence
Date: 2014-03-17 14:29:40
Message-ID: CAPCz1_1-uwKSg9m8=OMyKH+EUPdTeUvsRsAVLBS1YXz3oeW5cg@mail.gmail.com
Lists: pgsql-novice

Hi,

I have a large table named duplicate_files (47 GB), a sample of which is
shown below. I need to find every file_name that occurs more than once
with the same size, along with each path where it occurs.

FILE_NAME  FILESIZE  FULL_PATH
ABC.txt    12        I_12_122
ABC.txt    14        I_12_123
ABC.txt    12        I_12_125
ABC.txt    12        I_13_156
ABC.txt    14        I_14_123
ABC.txt    12        I_11_125
ABC.txt    15        I_12_123
ABC.txt    16        I_12_123
ABC.txt    11        I_12_123

The output is shown below.

FILE_NAME  FILESIZE  FULL_PATH
ABC.txt    12        I_12_122
ABC.txt    12        I_12_125
ABC.txt    12        I_13_156
ABC.txt    12        I_11_125

I used the query below to get the output, but it took 6 hours to run. Is
there a faster way to get the same results?

select file_name, filesize, full_path
from duplicate_files f1
where (file_name, filesize) in (
    select file_name, filesize
    from duplicate_files
    group by file_name, filesize
    having count(file_name) > 1
);
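
In case it helps anyone reproduce this on a small scale, here is a rough sketch that loads the sample rows above into a scratch table and runs the same query under EXPLAIN ANALYZE. The table name duplicate_files_sample and the column types are assumptions made for the example; the real table may be defined differently.

```sql
-- Scratch table for reproducing the problem. The name and the column
-- types are assumptions, not the real schema.
CREATE TEMP TABLE duplicate_files_sample (
    file_name text,
    filesize  bigint,
    full_path text
);

-- The nine sample rows from the message.
INSERT INTO duplicate_files_sample (file_name, filesize, full_path) VALUES
    ('ABC.txt', 12, 'I_12_122'),
    ('ABC.txt', 14, 'I_12_123'),
    ('ABC.txt', 12, 'I_12_125'),
    ('ABC.txt', 12, 'I_13_156'),
    ('ABC.txt', 14, 'I_14_123'),
    ('ABC.txt', 12, 'I_11_125'),
    ('ABC.txt', 15, 'I_12_123'),
    ('ABC.txt', 16, 'I_12_123'),
    ('ABC.txt', 11, 'I_12_123');

-- Same query as above, pointed at the scratch table.
-- EXPLAIN ANALYZE executes the statement and reports the plan that was
-- actually used, with row counts and timings per plan node.
EXPLAIN ANALYZE
SELECT file_name, filesize, full_path
FROM duplicate_files_sample f1
WHERE (file_name, filesize) IN (
    SELECT file_name, filesize
    FROM duplicate_files_sample
    GROUP BY file_name, filesize
    HAVING count(file_name) > 1
);
```

Note that on these sample rows the query also returns the two rows with size 14 (I_12_123 and I_14_123), since that name and size pair occurs twice as well. The plan on nine rows will not match the plan on the 47 GB table, so the EXPLAIN output from the real table is the one that matters for tuning.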

Any help or advice is appreciated. Thanks.

Regards,
Nipuna
