Re: partitioning question

From: Alex Samad <alex(at)samad(dot)com(dot)au>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: partitioning question
Date: 2017-07-31 00:25:54
Message-ID: CAJ+Q1PVUQ1ZF8uXW4_+qD4xsVUfvaFKC1U9AoWKOsQDsHYPndA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi

I note that you link to P10 and I am currently looking at 9.6. The changes
do look nice for partitioning for p10.

Interesting your suggest that the YYYYMM parition isn't that bad.

I will add currently we don't delete anything, we will keep adding to it.

Also I am thinking my insert trigger becomes a lot smaller and easier if I
leave it at yearly.

Also thinking if P10 was the current recommended version right now I would
probably look at YYYYMM because it looks like it makes partitioning easier

Alex

On 31 July 2017 at 09:54, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote:
> > Hi
> >
> > I was about to partition a large (?) approx 3T of data 2B rows into
> > partition tables but broken up into YYYYMM ...
> >
> > Now I have been reading about limiting the number of partitions otherwise
> > it could slow down the parser.
> >
> > My reasoning for limiting to YYYYMM was that most of the request would be
> > monthly based.
> >
> > Should I be making the partitioning based on YYYY instead and have lots
> > more indexs.
> >
> > If I have an index on the timestamp field will it help limiting to
> YYYYMM ?
>
> The major advantages of partitions are enumerated here:
> https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl-
> partitioning-overview
>
> For your case, it might be that seq scans of an entire "monthly" partition
> turn
> out to be very advantageous, compared with index scan (or seq scan of
> entire
> 3TB data).
>
> Also DROPing the oldest partition every month is commonly very much more
> efficient than DELETEing it..
>
> There are warnings like these:
>
> |All constraints on all partitions of the master table are examined during
> |constraint exclusion, so large numbers of partitions are likely to
> increase
> |query planning time considerably. Partitioning using these techniques
> will work
> |well with up to perhaps a hundred partitions; don't try to use many
> thousands
> |of partitions.
>
> Unless you have 100s of years of data I don't think it would be a problem.
>
> For us, having hundreds of partitions hasn't been an issue (planning time
> is
> insignificant for our analytic report queries). But there's an overhead to
> partitions and at some point the cost becomes significant. (Actually, I
> think
> one cost which *did* hit us, while experimenting with *daily* partition
> granularity of every table, was probably due to very large pg_statistics
> and
> pg_attributes tables, which no longer fit in buffer cache).
>
> Justin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Justin Pryzby 2017-07-31 01:07:55 Re: partitioning question
Previous Message Justin Pryzby 2017-07-30 23:54:04 Re: partitioning question