Re: When to use PARTITION BY HASH?

From: Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: When to use PARTITION BY HASH?
Date: 2020-06-03 07:45:48
Message-ID: CACACo5QUUG0KoTq8uFjtW9Bpqr7d8s=dGDdjxq0R+-YfNs6KPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Tue, Jun 2, 2020 at 7:33 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> > To: pgsql-general(at)lists(dot)postgresql(dot)org,
> pgsql-performance(at)lists(dot)postgresql(dot)org
>
> Please don't cross post to multiple lists.
>
> On Tue, Jun 02, 2020 at 07:17:11PM +0200, Oleksandr Shulgin wrote:
> > I was reading up on declarative partitioning[1] and I'm not sure what
> could
> > be a possible application of Hash partitioning.
>
> It's a good question. See Tom's complaint here.
> https://www.postgresql.org/message-id/31605.1586112900%40sss.pgh.pa.us
>
> It *does* provide the benefit of smaller indexes and smaller tables, which
> might allow seq scans to outpeform index scans.
>
> It's maybe only useful for equality conditions on the partition key, and
> not
> for ranges. Here, it scans a single partition:
>
> postgres=# CREATE TABLE t(i int) PARTITION BY HASH(i); CREATE TABLE t1
> PARTITION OF t FOR VALUES WITH (REMAINDER 0, MODULUS 3);
> postgres=# CREATE TABLE t2 PARTITION OF t FOR VALUES WITH (MODULUS 3,
> REMAINDER 1);
> postgres=# CREATE TABLE t3 PARTITION OF t FOR VALUES WITH (MODULUS 3,
> REMAINDER 2);
> postgres=# INSERT INTO t SELECT i%9 FROM generate_series(1,9999)i; ANALYZE
> t;
> postgres=# explain analyze SELECT * FROM t WHERE i=3;
> Seq Scan on t2 (cost=0.00..75.55 rows=2222 width=4) (actual
> time=0.021..0.518 rows=2222 loops=1)
> Filter: (i = 3)
> Rows Removed by Filter: 2222
>

I see. So it works with low cardinality in the partitioned column. With
high cardinality an index scan on an unpartitioned table would be
preferable I guess.

The documentation page I've linked only contains examples around
partitioning BY RANGE. I believe it'd be helpful to extend it with some
meaningful examples for LIST and HASH partitioning.

Regards,
--
Alex

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-06-03 09:00:36 Re: canceling statement due to conflict with recovery after pg_basebackup
Previous Message Oleksandr Shulgin 2020-06-03 07:38:49 Re: When to use PARTITION BY HASH?

Browse pgsql-performance by date

  From Date Subject
Next Message Oleksandr Shulgin 2020-06-03 11:55:25 Re: When to use PARTITION BY HASH?
Previous Message Oleksandr Shulgin 2020-06-03 07:38:49 Re: When to use PARTITION BY HASH?