Re: Using database to find file doublettes in my computer

From:	Sam Mason <sam(at)samason(dot)me(dot)uk>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Using database to find file doublettes in my computer
Date:	2008-11-18 12:36:42
Message-ID:	20081118123642.GE3829@frubble.xen.chris-lamb.co.uk
Lists:	pgsql-general

On Mon, Nov 17, 2008 at 11:22:47AM -0800, Lothar Behrens wrote:
> I need to find, as fast as possible, files that are duplicates, in
> other words, identical.
> I also need to identify the files that are not identical.

I'd probably just take a simple Unix command line approach, something
like:

find /base/dir -type f -exec md5sum {} \; | sort | uniq -Dw 32

this will give you a list of files whose contents are identical
(according to an MD5 hash). An alternative would be to put the hashes
into a database and run the matching up there.
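
As a rough illustration of the database alternative, here is a minimal Python sketch. It is only one possible way to do it, not something from the original thread: it uses an in-process SQLite database as a stand-in for PostgreSQL, and the table name `file_hashes` and the helpers `md5_of` and `find_duplicates` are hypothetical names chosen for this example. The grouping query itself is plain SQL and should carry over to PostgreSQL, whereas `INSERT OR REPLACE` is SQLite-specific.

```python
import hashlib
import sqlite3
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(base_dir, conn):
    """Store (path, md5) rows in a table, then match equal hashes in SQL.

    Returns a dict mapping each MD5 digest that occurs more than once
    to the sorted list of paths sharing it.
    """
    # Hypothetical schema: one row per regular file under base_dir.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_hashes ("
        "path TEXT PRIMARY KEY, md5 TEXT NOT NULL)"
    )
    rows = [
        (str(p), md5_of(p))
        for p in Path(base_dir).rglob("*")
        if p.is_file() and not p.is_symlink()  # roughly `find -type f`
    ]
    conn.executemany("INSERT OR REPLACE INTO file_hashes VALUES (?, ?)", rows)

    # Keep only the hashes that appear on more than one row.
    query = """
        SELECT md5, path
        FROM file_hashes
        WHERE md5 IN (
            SELECT md5 FROM file_hashes GROUP BY md5 HAVING COUNT(*) > 1
        )
        ORDER BY md5, path
    """
    groups = {}
    for md5, path in conn.execute(query):
        groups.setdefault(md5, []).append(path)
    return groups
```

Calling `find_duplicates("/base/dir", sqlite3.connect(":memory:"))` would return the groups of identical files. As with the shell pipeline, "identical" here means "same MD5 hash".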

Sam

In response to

Using database to find file doublettes in my computer at 2008-11-17 19:22:47 from Lothar Behrens

Responses

Re: Using database to find file doublettes in my computer at 2008-11-18 12:42:28 from Gerhard Heift
