Hash Function: MD5 or other?

From:	Peter Fein <pfein(at)pobox(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Hash Function: MD5 or other?
Date:	2005-06-13 22:49:59
Message-ID:	20050613174959.6fb1df80@layout.pfein.org
Lists:	pgsql-general

Hi-

I wanted to use a partial unique index (conditional on a flag) on a TEXT
column, but the index row size was too big for a btree. See the thread
"index row size 2728 exceeds btree maximum, 2713" from the beginning of
this month for someone with a similar problem. In it, someone suggests
indexing on a hash of the text instead. I'm fine with this, as the texts in
question are similar enough to each other to make collisions unlikely
and a collision won't really cause any serious problems.

My question is: is the built-in md5() function appropriate for this use, or
should I be using a function from pl/something? Figures on collision rates
would be nice as well; the typical chunk of text is probably 1-8 kB.
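
For a rough sense of accidental collision rates, the usual birthday-bound estimate can be sketched in Python. This is only an approximation under stated assumptions: it treats the 128-bit MD5 output as uniformly random, which says nothing about deliberately crafted collisions, and the function name `collision_probability` and the row counts are hypothetical, chosen for illustration.

```python
import math


def collision_probability(n_items: int, hash_bits: int = 128) -> float:
    """Birthday-bound approximation of the chance that at least two of
    n_items distinct inputs share a hash value.

    Assumes hash outputs are uniformly distributed over 2**hash_bits values:
        p ~= 1 - exp(-n * (n - 1) / 2**(hash_bits + 1))
    """
    exponent = -n_items * (n_items - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical table sizes; MD5 produces a 128-bit digest.
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>17,d} rows -> p ~ {collision_probability(n):.3e}")
```

Under these assumptions the length of each text (1 kB or 8 kB) does not enter the estimate; only the number of rows and the digest width do. For a million rows the result is on the order of 1e-27.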

Thanks!

--
Peter Fein pfein(at)pobox(dot)com 773-575-0694

Basically, if you're not a utopianist, you're a schmuck. -J. Feldman

Responses

Re: Hash Function: MD5 or other? at 2005-06-14 00:55:20 from Shelby Cain
Re: Hash Function: MD5 or other? at 2005-06-14 09:06:49 from Alex Stapleton
