Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?

From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
Date: 2019-12-16 02:23:25
Message-ID: b69f0a7d-9af7-9397-2dff-3e49041d2fac@gmail.com

On 12/15/19 3:59 PM, George Neuner wrote:
> On Tue, 10 Dec 2019 18:00:02 -0600, Ron <ronljohnsonjr(at)gmail(dot)com>
> wrote:
>
>> On 12/10/19 3:11 PM, Erwin Brandstetter wrote:
>>> I am looking for stable hash functions producing 8-byte or 4-byte hashes
>>> from long text values in Postgres 10 or later.
>>>
>>> There is md5(), the result of which can be cast to uuid. This reliably
>>> produces practically unique, stable 16-byte values. I have use cases where
>>> an 8-byte or even 4-byte hash would be good enough to make collisions
>>> reasonably unlikely (I can recheck on the full string) and would make
>>> expression indexes substantially smaller. I could truncate md5 and cast back and
>>> forth, but that seems like a lot of wasted computation. Are there
>>> suggestions for text hash functions that are
>>> - fast
>>> - keep collisions to a minimum
>>> - stable across major Postgres versions (so expression indexes don't break)
>>> - cryptographic aspect is not needed (acceptable, but no benefit)
>> What about a CRC32 function?  It's fast, and an SSE4.2 instruction for it has
>> been in Intel CPUs for about 10 years.
> On long text CRC will not be as discriminating as a real cryptohash,

When specifying a 4-byte hash, something must be sacrificed...
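
To put rough numbers on that trade-off, here is a minimal Python sketch. It is not from the thread, and the function names are made up for illustration. It computes a 4-byte CRC-32 and an 8-byte truncated MD5, then estimates the chance of at least one collision with the usual birthday-bound approximation, assuming the hash output is uniformly distributed. Note that `zlib.crc32` is the standard CRC-32 polynomial, whereas the SSE4.2 instruction computes CRC-32C, so the two would not produce the same values.

```python
import hashlib
import math
import zlib


def crc32_hash(text: str) -> int:
    """4-byte hash: CRC-32 of the UTF-8 bytes, as an unsigned 32-bit int."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def md5_trunc64(text: str) -> int:
    """8-byte hash: first 8 bytes of the MD5 digest, as an unsigned 64-bit int."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes hash values are uniformly distributed, which is only an
    approximation for CRC on real-world text.
    """
    return -math.expm1(-n_items * (n_items - 1) / 2.0 ** (hash_bits + 1))


if __name__ == "__main__":
    print(hex(crc32_hash("some long text value")))
    print(hex(md5_trunc64("some long text value")))
    # Compare 4-byte and 8-byte hashes for a few table sizes.
    for n in (10_000, 100_000, 1_000_000):
        p32 = collision_probability(n, 32)
        p64 = collision_probability(n, 64)
        print(f"{n:>9} rows: 32-bit {p32:.3g}, 64-bit {p64:.3g}")
```

Under these assumptions, a 4-byte hash is more likely than not to have at least one collision somewhere around 100,000 distinct values, which is why a recheck against the full string is needed; an 8-byte hash keeps the probability negligible at that scale.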

--
Angular momentum makes the world go 'round.
