Re: Query planning read a large amount of buffers for partitioned tables

From: bruno vieira da silva <brunogiovs(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Query planning read a large amount of buffers for partitioned tables
Date: 2025-01-16 14:56:31
Message-ID: CAB+Nuk8bW4xcjGEjANy9manpxOpH=0pRFOQRufwzEUvkG_euDQ@mail.gmail.com
Lists: pgsql-performance

Hello, thanks David.

This is a pg test deployment. Anyway, I ran a VACUUM FULL on the db, and
the number of buffers read increased a bit.
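For reference, the planning-time buffer figures discussed in this thread come from EXPLAIN's BUFFERS option, which PostgreSQL reports in a separate "Planning:" section (the table name below is hypothetical):

```sql
-- Planning-time buffer usage appears in a "Planning:" section of the
-- EXPLAIN output; BUFFERS can be used with or without ANALYZE.
-- "events" is a hypothetical partitioned table.
EXPLAIN (BUFFERS)
SELECT * FROM events WHERE created_at >= now() - interval '1 day';
-- The plan output ends with a section like:
--   Planning:
--     Buffers: shared hit=... read=...
```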

On Wed, Jan 15, 2025 at 3:01 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Thu, 16 Jan 2025 at 07:29, bruno vieira da silva
> <brunogiovs(at)gmail(dot)com> wrote:
> > On pg 17 we now have better visibility into the I/O required during
> > query planning.
> > So, as part of ongoing design work for table partitioning, I was
> > analyzing the performance implications of having more or fewer
> > partitions.
> > In one of my tests, EXPLAIN on a table with 200 partitions showed a
> > large amount of buffers read during planning: around 12k buffers.
>
> That's a suspiciously high number of buffers.
>
> > I observed that query planning seems to have a caching mechanism, as
> > subsequent similar queries require only a fraction of the buffers read
> > during query planning.
> > However, this "caching" seems to be per session: if I end the client
> > session and reconnect, executing the same query will again require
> > reading 12k buffers during query planning.
> >
> > Does pg have any mechanism to mitigate this issue (new sessions needing
> > to read a large amount of buffers for query planning)? Or should I
> > mitigate it by using connection pooling?
> > How is this caching done? Is there a way to have visibility into its
> > usage? Where is it stored?
>
> The caching is for relation meta-data and for various catalogue data.
> This is stored in local session hash tables. The caching is done
> lazily the first time something is looked up after the session starts.
> If you're doing very little work before ending the session, then
> you'll pay this overhead much more often than you would if you were to
> do more work in each session. A connection pooler would help you do
> that, otherwise it would need to be a redesign of how you're
> connecting to Postgres from your application.
>
> There's no easy way from EXPLAIN to see which tables or catalogue
> tables the IO is occurring on. However, you might want to try looking
> at pg_statio_all_tables directly before and after the query that's
> causing the 12k buffer accesses and then look at what's changed.
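One way to do that before/after comparison is sketched below. This is illustrative, not from the thread; note that the statistics views lag slightly behind the activity they report.

```sql
-- Snapshot catalogue-table I/O counters before the EXPLAIN in question.
CREATE TEMP TABLE statio_before AS
SELECT relname, heap_blks_read, idx_blks_read
FROM pg_statio_all_tables
WHERE schemaname = 'pg_catalog';

-- ... run the EXPLAIN from a fresh session here, then diff the counters:
SELECT s.relname,
       s.heap_blks_read - b.heap_blks_read AS heap_read_delta,
       s.idx_blks_read  - b.idx_blks_read  AS idx_read_delta
FROM pg_statio_all_tables s
JOIN statio_before b USING (relname)
WHERE s.schemaname = 'pg_catalog'
  AND (s.heap_blks_read - b.heap_blks_read)
    + (s.idx_blks_read  - b.idx_blks_read) > 0
ORDER BY (s.heap_blks_read - b.heap_blks_read)
       + (s.idx_blks_read  - b.idx_blks_read) DESC;
```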
>
> I suspect if you're accessing 12k buffers to run EXPLAIN that you have
> some auto-vacuum starvation issues. Is auto-vacuum enabled and
> running? If you look at pg_stat_activity, do you see autovacuum
> running? It's possible that it's running but not configured to run
> quickly enough to keep up with demand. Alternatively, it may be
> keeping up now, but at some point in the past, it might not have been
> and you have some bloat either in an index or in a catalogue table as
> a result.
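A quick check along those lines, using the standard statistics views (a sketch, not part of the original message):

```sql
-- Are autovacuum workers currently running?
SELECT pid, state, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- Dead-tuple counts and last (auto)vacuum times for the catalogue
-- tables, which is where planner-related bloat would show up:
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_all_tables
WHERE schemaname = 'pg_catalog'
ORDER BY n_dead_tup DESC
LIMIT 10;
```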
>
> David
>

--
Bruno Vieira da Silva
