Re: Query planning read a large amount of buffers for partitioned tables

From: bruno vieira da silva <brunogiovs(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Query planning read a large amount of buffers for partitioned tables
Date: 2025-02-18 15:48:26
Message-ID: CAB+Nuk_rqmZjtVJmTAbmyCeM8-eqB97VkQcTtkYTUksHj0pwjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello, Your explanation aligns with the idea I had that having more
shared_buffers and connection pooling are very important in the context of
the partitioned tables.

Thanks. Regards.

On Tue, Feb 18, 2025 at 7:16 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Tue, 18 Feb 2025 at 09:18, bruno vieira da silva
> <brunogiovs(at)gmail(dot)com> wrote:
> >
> > Hello, I did a more comprehensive test with a different number of
> partitions and I found this:
> >
> > Summary buffers usage for the first call vs second call on the same
> session.
> >
> > Query 200, 100, 50, and 10 partitions:
> > 200 Partitions: 12,828 (100MB)
> > 100 Partitions: 9,329 (72MB)
> > 50 Partitions: 3,305 (25MB)
> > 10 Partitions: 875 (7MB)
> >
> > Same query on the same session:
> > 200 Partitions: 205 (1.6MB)
> > 100 Partitions: 5 (40KB)
> > 50 Partitions: 5 (40KB)
> > 10 Partitions: 5 (40KB)
> >
> > I did test on PG 17.3 no relevant changes.
> >
> > Question is, does it make sense?
>
> I didn't analyze this in great detail, but nothing looks too
> surprising to me. I get roughly the same numbers on the latest git
> master branch as you've shown above.
>
> A PostgreSQL backend will cache various metadata about relations the
> first time they're accessed in a backend. Building those caches
> requires accessing the system catalogue tables. I expect the majority
> of the buffer accesses are for those tables. If you're curious about
> what's being accessed and have a fresh test instance handy, you could
> use strace to see which buffers are being read. You'll need to ensure
> the shared buffers are not caching anything. Restarting PostgreSQL
> should clear those out sufficiently. You can translate the filenodes
> back into relation names by using a query such as: select relname from
> pg_class where pg_relation_filenode(oid)=1259;
>
> If this is causing you problems then maybe a connection pooler would
> help you. With one of those, the backend will live longer than just 1
> query. You could also perhaps revisit your partition count to see if
> the number you've chosen gives you the best performance. It's very
> common for people to over-partition and not properly consider the
> overheads of partitioning.
>
> David
>

--
Bruno Vieira da Silva

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2025-02-20 20:22:15 Re: Re: proposal: schema variables
Previous Message David Rowley 2025-02-18 12:16:09 Re: Query planning read a large amount of buffers for partitioned tables