Re: Query planning read a large amount of buffers for partitioned tables

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: bruno vieira da silva <brunogiovs(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Query planning read a large amount of buffers for partitioned tables
Date: 2025-01-15 20:01:33
Message-ID: CAApHDvqCEWF-3Eg=Lmrv3SGFuFX0KzHf8XfZT22Q1XL=8KN0UA@mail.gmail.com
Lists: pgsql-performance

On Thu, 16 Jan 2025 at 07:29, bruno vieira da silva
<brunogiovs(at)gmail(dot)com> wrote:
> On PG 17 we now have better visibility into the I/O required during query planning.
> So, as part of ongoing design work for table partitioning, I was analyzing the performance implications of having more or fewer partitions.
> In one of my tests, EXPLAIN on a table with 200 partitions showed a large number of buffers read during planning: around 12k buffers.

That's a suspiciously high number of buffers.

> I observed that query planning seems to have a caching mechanism, as subsequent similar queries require only a fraction of the buffer reads during query planning.
> However, this "caching" seems to be per session: if I end the client session and reconnect, the same query will again require reading 12k buffers during planning.
>
> Does pg have any mechanism to mitigate this issue (new sessions needing to read a large number of buffers for query planning)? Or should I mitigate it by using connection pooling?
> How is this caching done? Is there a way to have visibility into its usage? Where is it stored?

The caching is for relation meta-data and for various catalogue data.
This is stored in local session hash tables. The caching is done
lazily, the first time something is looked up after the session starts.
If you're doing very little work before ending the session, then
you'll pay this overhead much more often than you would if you did
more work in each session. A connection pooler would help you do
that; otherwise it would require redesigning how you connect to
Postgres from your application.
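You can see the effect directly from psql. A sketch (the table name and predicate are placeholders; EXPLAIN's BUFFERS option has reported planning-time buffer usage since PG 13):

```sql
-- First EXPLAIN after a fresh connection: the planner must load
-- relation and catalogue metadata for every partition, so the
-- "Buffers:" line under "Planning:" is high.
EXPLAIN (BUFFERS) SELECT * FROM my_partitioned_table WHERE id = 42;

-- Re-running the same statement in the same session reuses the
-- relcache/catcache entries built lazily above, so the planning
-- buffer counts drop to a fraction of the first run.
EXPLAIN (BUFFERS) SELECT * FROM my_partitioned_table WHERE id = 42;
```

Reconnecting and repeating the first EXPLAIN shows the high count again, which is the per-session behaviour you observed.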

There's no easy way from EXPLAIN to see which tables or catalogue
tables the I/O is occurring on. However, you might want to try
looking at pg_statio_all_tables directly before and after the query
that's causing the 12k buffer accesses and then look at what's changed.
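One way to do that before/after comparison is to snapshot the view into a temp table and diff it. A sketch (assumes relname is unique enough for your setup; qualify by schemaname if not, and note the statistics collector can lag slightly):

```sql
-- Snapshot the block-read counters before running the EXPLAIN.
-- pg_statio_all_tables includes the pg_catalog tables, which is
-- where planning I/O for catalogue lookups will show up.
CREATE TEMP TABLE statio_before AS
  SELECT relname, heap_blks_read, idx_blks_read
  FROM pg_statio_all_tables;

-- ... run the EXPLAIN that causes the ~12k buffer accesses ...

-- Diff against the snapshot to see which relations were read.
SELECT a.relname,
       a.heap_blks_read - b.heap_blks_read AS heap_read_delta,
       a.idx_blks_read  - b.idx_blks_read  AS idx_read_delta
FROM pg_statio_all_tables a
JOIN statio_before b USING (relname)
WHERE a.heap_blks_read > b.heap_blks_read
   OR a.idx_blks_read  > b.idx_blks_read
ORDER BY heap_read_delta + idx_read_delta DESC;
```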

I suspect that if you're accessing 12k buffers to run EXPLAIN, you
have some auto-vacuum starvation issues. Is auto-vacuum enabled and
running? If you look at pg_stat_activity, do you see autovacuum
running? It's possible that it's running but not configured to run
quickly enough to keep up with demand. Alternatively, it may be
keeping up now, but at some point in the past it might not have been,
and you have some bloat in an index or a catalogue table as a result.
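A couple of quick checks for that. A sketch (pg_stat_activity's backend_type column exists on PG 10 and later):

```sql
-- Is an autovacuum worker active right now?
SELECT pid, query, state
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- When were the catalogue tables last vacuumed? A high dead-tuple
-- count with an old (or NULL) last_autovacuum suggests autovacuum
-- isn't keeping up, or wasn't at some point, leaving bloat behind.
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_all_tables
WHERE schemaname = 'pg_catalog'
ORDER BY n_dead_tup DESC;
```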

David
