Re: Should we warn against using too many partitions?

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we warn against using too many partitions?
Date: 2019-06-06 05:29:02
Message-ID: 20190606052902.GX3079@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I suggest just minor variations on language.

On Thu, Jun 06, 2019 at 04:43:48PM +1200, David Rowley wrote:

>diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
>index cce1618fc1..ab26630199 100644
>--- a/doc/src/sgml/ddl.sgml
>+++ b/doc/src/sgml/ddl.sgml
>@@ -4674,6 +4675,76 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
> </itemizedlist>
> </para>
> </sect2>
>+
>+ <sect2 id="ddl-partitioning-declarative-best-practices">
>+ <title>Declarative Partitioning Best Practices</title>
>+
>+ <para>
>+ The choice of how to partition a table should be considered carefully as

Either say "How to partition consider should be .." or "The choice should MADE carefully" ?

>+ <para>
>+ One of the most critical design decisions will be the column or columns
>+ which you partition your data by. Often the best choice will be to

by which ?

>+ <para>
>+ Choosing the number of partitions to divide the table into is also a

the TARGET number of partitions BY WHICH to divide the table ?

>+ critical decision to make. Not having enough partitions may mean that
>+ indexes remain too large and that data locality remains poor which could
>+ result in poor cache hit ratios. However, dividing the table into too
>+ many partitions can also cause issues. Too many partitions can mean
>+ slower query planning times and higher memory consumption during both
>+ query planning and execution. It's also important to consider what
>+ changes may occur in the future when choosing how to partition your table.
>+ For example, if you choose to have one partition per customer and you
>+ currently have a small number of large customers, what will the

have ONLY ?

>+ implications be if in several years you obtain a large number of small
>+ customers. In this case, it may be better to choose to partition by
>+ <literal>HASH</literal> and choose a reasonable amount of partitions

reasonable NUMBER ?

>+ <para>
>+ It is also important to consider the overhead of partitioning during
>+ query planning and execution. The query planner is generally able to
>+ handle partition hierarchies up a few thousand partitions fairly well,
>+ providing that the vast majority of them can be pruned during query

provided ?

I would say: "provided that typical queries prune all but a small number of
partitions during planning time".

>+ <command>DELETE</command> commands. Also, even if most queries are
>+ able to prune a high number of partitions during query planning, it still

LARGE number?

>+ may be undesirable to have a large number of partitions as each partition

may still ?

>+ also will obtain a relation cache entry in each session which uses the

will require ? Or occupy ?

>+ <para>
>+ With data warehouse type workloads it can make sense to use a larger
>+ number of partitions than with an OLTP type workload. Generally, in data
>+ warehouses, query planning time is less of a concern as the majority of
>+ processing time is generally spent during query execution. With either of

remove the 2nd "generally"

>+ these two types of workload, it is important to make the right decisions
>+ early as re-partitioning large quantities of data can be painstakingly

early COMMA ?

PAINFULLY slow

>+ When performance is critical, performing workload simulations to
>+ assist in making the correct decisions can be beneficial.

I would say:
Simulations of the intended workload are beneficial for optimizing partitioning
strategy.

Thanks,
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2019-06-06 05:46:23 Re: Should we warn against using too many partitions?
Previous Message David Rowley 2019-06-06 04:43:48 Re: Should we warn against using too many partitions?