Re: When to use PARTITION BY HASH?

From: Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: When to use PARTITION BY HASH?
Date: 2020-06-03 11:55:25
Message-ID: CACACo5RxgQBO_cB97yFkWwLo_-Vs3WtUei2KszGg2LkJ95uiXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

(sticking to pgsql-general)

On Tue, Jun 2, 2020 at 7:45 PM Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>
wrote:

>
> On Tue, Jun 2, 2020 at 10:17 AM Oleksandr Shulgin <
> oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
>
>>
>> I was reading up on declarative partitioning[1] and I'm not sure what
>> could be a possible application of Hash partitioning.
>>
>> Is anyone actually using it? What are typical use cases? What benefits
>> does such a partitioning scheme provide?
>>
>> On its face, it seems that it can only give you a number of tables which
>> are smaller than the un-partitioned one, but I fail to see how it would
>> provide any of the potential advantages listed in the documentation.
>>
>
>

> From my point of view, hash partitioning is very useful for spreading out
> high insert/update load.
>

Do you also assign the partitions to different tablespaces as you've
hinted below or do you see performance improvement from partitioning
alone? How does that work? Does it give better results than using a RAID
to spread the disk IO, for example?

Yes its' true you end up with more smaller tables than one big large one,
> but remember the indexes are (often) tree data structures. Smaller trees
> are faster than bigger trees. By making the indexes smaller they are
> faster. Since the planner can knows to only examine the specific index it
> needs, this ends up being a lot faster.
>

That sounds logical, but can it be demonstrated? If the index(es) fit in
memory fully, it doesn't make a measurable difference, I guess?

With hash partitioning you are not expected, in general, to end up with a
small number of partitions being accessed more heavily than the rest. So
your indexes will also not fit into memory.

I have the feeling that using a hash function to distribute rows simply
contradicts the basic assumption of when you would think of partitioning
your table at all: that is to make sure the most active part of the table
and indexes is small enough to be cached in memory.

Regards,
--
Alex

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2020-06-03 12:05:44 Re: Replication conflicts despite hot_standby_feedback = on?
Previous Message Julien Rouhaud 2020-06-03 11:41:23 Re: Replication conflicts despite hot_standby_feedback = on?

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2020-06-03 15:09:59 Re: When to use PARTITION BY HASH?
Previous Message Oleksandr Shulgin 2020-06-03 07:45:48 Re: When to use PARTITION BY HASH?