Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?

From: Erwin Brandstetter <brsaweda(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)postgresql(dot)org
Subject: Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
Date: 2019-12-11 18:01:43
Message-ID: CAGHENJ6GibwodRSmgsQ2tLnXs9HcNeBJ2e2NYtwCB4Pbfqm-ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for the suggestion. Seems like a good assumption and I have been
using hashtext() in the past. But I am uncertain whether it is the best
option.

Guess Tom's warning in
https://www.postgresql.org/message-id/9434.1568839177@sss.pgh.pa.us about
portability only refers to hashtextextended() and friends not being there
in Postgres 10 or older.

But why are none of these functions documented? Does the project still not
...

> want people to rely on them continuing to have exactly the current
behavior.

I am not complaining, maybe just nobody did the work. But it's also
mentioned in this old thread, that hastext() changed in the past. Is all of
that outdated and we are welcome to use those functions for indexing?
https://www.postgresql.org/message-id/flat/24463.1329854466%40sss.pgh.pa.us#c18e62281dc78f6d64c1a4d41ab8569b
<https://www.postgresql.org/message-id/24463.1329854466@sss.pgh.pa.us>

Filtering with amprocnum = 2 gets functions producing bigint in Postgres 11
or later. Not sure about the exact meaning of amprocnum, manual says
"Support function number".

Remaining problem either way: no hash function returning bigint for
Postgres 10.

Regards
Erwin

On Tue, Dec 10, 2019 at 11:13 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:

> On Tue, 2019-12-10 at 22:11 +0100, Erwin Brandstetter wrote:
> > I am looking for stable hash functions producing 8-byte or 4-byte hashes
> from long text values in Postgres 10 or later.
> >
> > [...]
> >
> > There is an old post from 2012 by Tom Lane suggesting that hashtext()
> and friends are not for users:
> >
> > https://www.postgresql.org/message-id/24463.1329854466%40sss.pgh.pa.us
>
> Changing a hash function would corrupt hash indexes, wouldn't it?
>
> So I'd expect these functions to be pretty stable:
>
> SELECT amp.amproc
> FROM pg_amproc AS amp
> JOIN pg_opfamily AS opf ON amp.amprocfamily = opf.oid
> JOIN pg_am ON opf.opfmethod = pg_am.oid
> WHERE pg_am.amname = 'hash'
> AND amp.amprocnum = 1;
>
> Or at least there would have to be a fat warning in the release notes
> to reindex hash indexes.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Erwin Brandstetter 2019-12-11 18:07:09 Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
Previous Message Adrian Klaver 2019-12-11 17:19:53 Re: PGUSER and initdb