Hash Function: MD5 or other?

From: Peter Fein <pfein(at)pobox(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Hash Function: MD5 or other?
Date: 2005-06-13 22:49:59
Message-ID: 20050613174959.6fb1df80@layout.pfein.org
Lists: pgsql-general

Hi-

I wanted to use a partial unique index (dependent on a flag) on a TEXT
column, but the index row size was too big for btrees. See the thread
"index row size 2728 exceeds btree maximum, 2713" from the beginning of
this month for someone with a similar problem. In it, someone suggests
indexing on a hash of the text. I'm fine with this, as the texts in
question are similar enough to each other to make collisions unlikely
and a collision won't really cause any serious problems.

My question is: is the built-in md5() function appropriate for this use, or should I
be using a function from pl/something? Figures on collision rates would
be nice as well - the typical chunk of text is probably 1k-8k.
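
For a rough sense of the collision rate asked about above, here is a minimal sketch using the standard birthday-bound approximation. It assumes MD5 output behaves like a uniformly random 128-bit value on ordinary (non-adversarial) text, which is an assumption and not something measured on this data; deliberately constructed MD5 collisions are a separate matter that this estimate does not cover. The names collision_probability and md5_hex are made up for illustration.

```python
import hashlib
import math

MD5_BITS = 128  # MD5 digests are 128 bits wide


def collision_probability(n_rows: int, bits: int = MD5_BITS) -> float:
    """Approximate chance that at least two of n_rows distinct texts share a digest.

    Birthday bound: p ~= 1 - exp(-n(n-1) / 2^(bits+1)), assuming uniform digests.
    """
    exponent = -n_rows * (n_rows - 1) / 2.0 ** (bits + 1)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


def md5_hex(text: str) -> str:
    """Hex MD5 digest of the UTF-8 bytes of text (32 hex characters)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>16,d} rows -> p ~= {collision_probability(n):.3e}")
    print(md5_hex("hello"))
```

Under that assumption the length of each text (1k or 8k) does not enter the estimate; only the number of indexed rows and the digest width do. Even at a billion rows the estimated probability stays on the order of 1e-21.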

Thanks!

--
Peter Fein pfein(at)pobox(dot)com 773-575-0694

Basically, if you're not a utopianist, you're a schmuck. -J. Feldman
