From: | MichaelDBA <MichaelDBA(at)sqlexec(dot)com> |
---|---|
To: | Imre Samu <pella(dot)samu(at)gmail(dot)com> |
Cc: | Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>, pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: When to use PARTITION BY HASH? |
Date: | 2020-06-07 11:41:28 |
Message-ID: | 1de60053-ac43-87f3-288a-55b3f31d67f7@sqlexec.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-performance |
The article referenced below assumes a worst case scenario for
bulk-loading with hash partitioned tables. It assumes that the values
being inserted are in strict ascending or descending order with no gaps
(like a sequence number incrementing by 1), thereby ensuring every
partition is hit in order before repeating the process. If the values
being inserted are not strictly sequential with no gaps, then the
performance is much better. Obviously, what part of the tables and
indexes are in memory has a lot to do with it as well.
Regards,
Michael Vitale
Imre Samu wrote on 6/5/2020 7:48 AM:
> > "Bulk loads ...",
>
> As I see - There is an interesting bulkload benchmark:
>
> "How Bulkload performance is affected by table partitioning in
> PostgreSQL" by Beena Emerson (Enterprisedb, December 4, 2019 )
> /SUMMARY: This article covers how benchmark tests can be used to
> demonstrate the effect of table partitioning on performance. Tests
> using range- and hash-partitioned tables are compared and the reasons
> for their different results are explained:
> 1. Range partitions
> 2. Hash partitions
> 3. Combination graphs
> 4. Explaining the behavior
> 5. Conclusion/
> /
> /
> /"For the hash-partitioned table, the first value is inserted in the
> first partition, the second number in the second partition and so on
> till all the partitions are reached before it loops back to the first
> partition again until all the data is exhausted. Thus it exhibits the
> worst-case scenario where the partition is repeatedly switched for
> every value inserted. As a result, the number of times the partition
> is switched in a range-partitioned table is equal to the number of
> partitions, while in a hash-partitioned table, the number of times the
> partition has switched is equal to the amount of data being inserted.
> This causes the massive difference in timing for the two partition
> types."/
>
> https://www.enterprisedb.com/postgres-tutorials/how-bulkload-performance-affected-table-partitioning-postgresql
>
> Regards,
> Imre
>
From | Date | Subject | |
---|---|---|---|
Next Message | Wenjun Che | 2020-06-07 13:06:43 | Re: Question on full vacuum clearing waste space |
Previous Message | Mohamed Wael Khobalatte | 2020-06-07 04:31:49 | Re: Question on full vacuum clearing waste space |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2020-06-07 21:23:10 | Re: When to use PARTITION BY HASH? |
Previous Message | Ron | 2020-06-07 01:58:08 | Re: When to use PARTITION BY HASH? |