Re: When to use PARTITION BY HASH?

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: When to use PARTITION BY HASH?
Date: 2020-06-02 17:43:02
Message-ID: CAKFQuwaDAP=sN=ds0HoJeds=T23D5=6Gq1TK8v+F7h6ctGvTvA@mail.gmail.com
Lists: pgsql-general pgsql-performance

On Tue, Jun 2, 2020 at 10:17 AM Oleksandr Shulgin <
oleksandr(dot)shulgin(at)zalando(dot)de> wrote:

> That *might* turn out to be the case with a small number of distinct
> values in the partitioning column(s), but then why rely on hash
> assignment instead of using PARTITION BY LIST in the first place?
>
> [1] https://www.postgresql.org/docs/12/ddl-partitioning.html
>

Why the cross-posting? (-performance is oriented toward problem solving,
not theory, so -general is the one and only PostgreSQL list this should
have been sent to)

Anyway, quoting the documentation you linked to:

"When choosing how to partition your table, it's also important to consider
what changes may occur in the future. For example, if you choose to have
one partition per customer and you currently have a small number of large
customers, consider the implications if in several years you instead find
yourself with a large number of small customers. In this case, it may be
better to choose to partition by HASH and choose a reasonable number of
partitions rather than trying to partition by LIST and hoping that the
number of customers does not increase beyond what it is practical to
partition the data by."

Hashing does indeed preclude some of the benefits of partitioning, but it
introduces others.

I suspect that running each input through a hash function and checking for
equality on the output would be better than trying to "OR" a partition list
together in order to map multiple inputs onto the same table.
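
To make the contrast concrete, here is a minimal sketch of the two
approaches for the per-customer case in the quoted documentation. The
table and column names (orders_list, orders_hash, customer_id,
order_total) and the modulus of 4 are assumptions made up for
illustration; only the PARTITION BY and FOR VALUES syntax comes from
PostgreSQL itself.

```sql
-- Hypothetical tables and columns, for illustration only.

-- LIST: every customer_id has to be named in some partition's IN (...) list.
CREATE TABLE orders_list (
    customer_id integer NOT NULL,
    order_total numeric
) PARTITION BY LIST (customer_id);

CREATE TABLE orders_list_p1 PARTITION OF orders_list
    FOR VALUES IN (1, 2, 3);
CREATE TABLE orders_list_p2 PARTITION OF orders_list
    FOR VALUES IN (4, 5, 6);
-- A row with customer_id = 7 has nowhere to go until a partition
-- covering 7 (or a DEFAULT partition) is added.

-- HASH: a fixed number of partitions; the hash of customer_id picks one.
CREATE TABLE orders_hash (
    customer_id integer NOT NULL,
    order_total numeric
) PARTITION BY HASH (customer_id);

CREATE TABLE orders_hash_p0 PARTITION OF orders_hash
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_hash_p1 PARTITION OF orders_hash
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_hash_p2 PARTITION OF orders_hash
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_hash_p3 PARTITION OF orders_hash
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);
-- Any customer_id, including ones that do not exist yet, lands in
-- exactly one of the four partitions.
```

With the LIST layout the value lists have to be maintained as customers
come and go; with the HASH layout the partition count stays fixed and new
values are routed without any DDL changes, at the cost of not being able
to choose which values share a partition.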

David J.
