Postgres 11, partitioning with a custom hash function

From: Harry B <harrysungod(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Postgres 11, partitioning with a custom hash function
Date: 2018-10-02 04:41:20
Message-ID: CAMG7=yUuHqf7M32px=h0gf0Za+TF_NdO1mi5PF5T6HxvRvx8pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I am interested in trying the hash partitioning method now available in 11
(trying the beta 4). However, I have the data already hashed at the
application level across multiple postgres instances. If possible, I would
like to keep these two hashing methods same. This would enable me to move
a single partition (considering we have attach/detach methods available)
from one instance to another and have queries work seamlessly.

The application can control what data/query is sent to each instance - the
only thing making this setup impossible is (a) the built-in HASH function
not available/accessible to me outside of pg context, say, as a C library.
and (b) I don't know how to sub a known hash function (say, murmur, xxhash,
siphash) instead of the builtin hash function. I am not particularly
insisting on any particular hash function, except for it to available
outside of postgres (say as a C or Go library).

Based on a discussion in the IRC channel, I was told I could try creating a
custom hash function (postgres extension) and use it in RANGE or LIST
partitioning with that expression.

I have the following code installed as a postgres extension
http://dpaste.com/2594KWM, takes an implementation of xxhash.c and sticks
it in as a postgres extension †
~/tmp/pge$ gcc -fPIC -I/usr/include/postgresql/11/server -c pge.c && gcc
-shared -o pge.so pge.o

Problem is that with this setup, partitioning of the writes/inserts work
fine, but I don't see pg excluding the unnecessary partitions on
read/queries

http://dpaste.com/1C0XY3M

This setup based on expression also has other issues - I can't mark k as a
primary key or have a unique key on that column. If it is declared as a
hash partition, I can have those.

This question may be related to this thread as well
https://www.postgresql-archive.org/Hash-Functions-td5961137.html

† surprisingly, this function works even if I don't compile in xxhash.c/o
into the .so - that is yet another side mystery to figure out. It is like
the symbol XXH64 is already available dynamically. I did have plpython
installed at some point. but this is a separate issue.

--
Harry

Responses

Browse pgsql-general by date

  From Date Subject
Next Message magodo 2018-10-02 05:46:27 how to identify the timeline of specified recovery_target_time when do multiple PITR
Previous Message Tom Lane 2018-10-01 23:57:08 Re: Postgres 11 procedures and result sets