Re: Thousands of partitions performance questions

From: Shai Cantor <shaicantor(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Thousands of partitions performance questions
Date: 2019-04-29 07:20:06
Message-ID: CABJyNM49eLiM-PGRSxnqb07gGFxx+gzsEpvpMNSjoX+mu_2Chw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks a lot for your feedback.
We're gonna user PG 11 on AWS RDS.
I will do some workload simulations.
Some notes I haven't shared or were not clear enough on the previous post.

1. *Data is inserted using the copy command only* and in an offline
manner. Meaning, no user action creates or updates the data. An offline job
runs is. Number of inserts can reach up to 1500 a day.
2. *Queries are only on a single partition*

- Can that ease the performance?
- Will querying directly the partition tables help?

On Mon, Apr 29, 2019 at 8:46 AM David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
wrote:

> On Mon, 29 Apr 2019 at 17:13, Shai Cantor <shaicantor(at)gmail(dot)com> wrote:
> > Will the db hold 135000 (45000 * 3 months) partitions under the
> assumption that I query only 1 partition?
> > Should I model it differently in terms of schema, partitions etc.?
>
> Which PG version?
>
> Before PG11 the query planner pruned unneeded partition by looking at
> each partition and determining if the partition constraint could not
> match the base quals on the query. PG11 improved this by adding a
> smarter and faster algorithm to get rid of non-matching partitions,
> however, this really only speeds things up a little as it only really
> allows the planner to skip generating paths for these partitions,
> certain meta-data is still loaded, which is not really slow per
> partition, but it is slow if you have thousands of partitions.
>
> PG11 is also pretty bad at planning UPDATE/DELETEs to partitioned
> tables with a large number of partitions. You'll at best suffer from
> high planning times to plan these queries and at worst suffer out of
> memory errors with that many partitions.
>
> The yet to be released PG12 improves both of these deficiencies
> providing you can eliminate the majority of partitions during query
> planning. PG12 also improves the performance of INSERT into a
> partitioned table, wherein PG11 a lock was obtained on every
> partition, with PG12 we only grab a lock on a partition the first time
> the INSERT command inserts a row into it. If you're just INSERTing 1
> row per command into a partitioned table with many partitions then
> this makes a pretty big difference.
>
> Depending on the types of query you're running it's likely not a good
> idea to go above 100 or so partitions with PG11. You might get away
> with more if you're running a more data-warehouse type load, i.e fewer
> but longer running queries, but for a more OLTP type workload, with
> more queries and lower latencies, then you may struggle to cope with a
> dozen.
>
> I'd recommend you do workload simulations with whatever number you
> choose and ensure performance is to the level you require before
> getting in too deep with your design.
>
> If your go-live date is near the end of the year or beyond, then it
> might be a good idea to start testing with PG12 right away. The
> release date for that will likely be around the middle of October.
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Rowley 2019-04-29 07:34:37 Re: Thousands of partitions performance questions
Previous Message David Rowley 2019-04-29 05:46:43 Re: Thousands of partitions performance questions