ANALYZE on partitioned tables vs on individual partitions

From: Michael Harris <harmic(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: ANALYZE on partitioned tables vs on individual partitions
Date: 2024-08-07 04:13:10
Message-ID: CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello Experts,

Our application has a database with a large number of partitioned tables
used to store time series data.

It is partitioned by time: new data is populated into the current
partition, old partitions eventually get dropped, and the partitions in
between are largely static.

I had not realized until recently that the partitioned tables themselves
are not analyzed by the autovacuum processes. I am now wondering if we
should be manually analyzing those.

The problem is they are quite enormous and take a long time to analyze - I
tested one yesterday and it took ~30mins, and we have hundreds of them. It
might not be practical to regularly analyze them.

My questions are:

1. What is the consequence of not having good statistics on partitioned
table level, if you do have good statistics on the partition level?

2. Which planning operations rely on partitioned table level statistics?

My reason for asking is I'm trying to understand the tradeoff between the
IO cost associated with frequently analyzing as opposed to possibly
non-optimal plans being chosen.

Thanks in advance, and sorry if this has all been covered before. I could
not find anything definitive in the docs, and while I did find a few
references by googling, none really explained the answers to those
questions too well.

Regards
Mike

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Muhammad Ikram 2024-08-07 04:32:19 Re: Streaming replication issue post upgrade from version 11 to 14 on windows 2016 Server
Previous Message Christophe Pettus 2024-08-07 03:14:22 Re: data checksums