Re: Question related to partitioning with pg_partman

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: sud <suds1434(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Question related to partitioning with pg_partman
Date: 2024-03-10 20:25:42
Message-ID: 14f28b7a-9ea4-49b2-9000-b88dbe0332ec@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 3/10/24 11:34, sud wrote:
>
> On Sun, Mar 10, 2024 at 11:31 PM Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>
> 1) The partition will be across one day(24 hours) it is just the times
> may confuse people. Per you example 2024-03-07 00:00:00+00  is the same
> time as 2024-03-06 19:00:00-05 for EST. The issue is that the +00 and
> -05 maybe ignored. Also it depends on the clients being consistent in
> using timestamptz.
>
> 2) You still have not answered what the datetime range(not date range)
> is that will be queried. If you have the partitions Midnight to
> Midnight
> UTC and the clients are querying Midnight to Midnight local time the
> query will not match the partitions.
>
>
>  My apology if not able to clearly put the details. Actually, the query
> will always happen on a day basis i.e they can query from one day to 15
> days transactions. But as you rightly pointed , the partitions can only
> span from midnight to midnight in one timezone, and thus users who
> queries the data from another time zone will mostly scan two partitions
> (even if they just queries one days transaction data in their own
> timezone). And I don't see an easy solution for this , which will help
> all users across all time zones to scan only a single partition in the
> database, when they queries data for a single transaction date.
>
> And thus my question was, is it necessary to have the creation of
> partitions to happen on UTC time zone only? and then whatever
> transaction data inserted by the users from respective time zones will
> be stored in the database as is and will be queried based on the user
> timezone (it may span across multiple partitions though for a single
> user transaction date).

This is going to depend on many things.

1) Partitions are not free they have overhead, which is fine if the
cost(overhead) is less then the benefits. For details on that see:

https://www.postgresql.org/docs/current/ddl-partitioning.html

and partition parts of

https://www.postgresql.org/docs/current/sql-createtable.html

As part of this there is the consideration of whether daily partitions
are really what you want?

2) What you hope to get out of the partitioning?

a) If it is confining queries to the partition boundaries then you have
already stated that is not going to happen.

b) If it is for data pruning purposes, then you have something to
consider on both ends. Creating/dropping partitions with Midnight to
Midnight UTC means you will need to consider whether they cover the
range of datetimes that your users are interested in. In other words
creating a partition ahead that covers local times that resolve to a UTC
time in the 'future'. On the back end not dropping a partition until it
has gone out of scope for everybody.

To answer 1 & 2 you are probably going to need to create a test setup
and verify how the expected queries are actually going to work with your
partition scheme.

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ilya Basin 2024-03-10 20:51:21 Re: Query plans not identical for `id in(1,2,3)` and `(id=1 or id=2 or id=3)`
Previous Message sud 2024-03-10 18:34:02 Re: Question related to partitioning with pg_partman