From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Using per-transaction memory contexts for storing decoded tuples |
Date: | 2024-09-23 21:36:09 |
Message-ID: | CAD21AoB82XWj1QMzX-k=UkMVJy=zwzZuHNNmxg_v2kJ-sAQy2Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Sep 19, 2024 at 10:44 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Sep 19, 2024 at 10:33 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Sep 18, 2024 at 8:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Sep 19, 2024 at 6:46 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, 19 Sept 2024 at 11:54, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > I've done some benchmark tests for three different code bases with
> > > > > different test cases. In short, reducing the generation memory context
> > > > > block size to 8kB seems promising; it mitigates the problem
> > > > > while keeping similar performance.
> > > >
> > > > Did you try any sizes between 8KB and 8MB? A 1000x reduction seems
> > > > quite a large jump. There is additional overhead from having more
> > > > blocks. It means more malloc() work and more free() work when deleting
> > > > a context. It would be nice to see some numbers with all powers of 2
> > > > between 8KB and 8MB. I imagine the returns are diminishing as the
> > > > block size is reduced further.
> > > >
> > >
> > > Good idea.
> >
> > Agreed.
> >
> > I've done other benchmarking tests while changing the memory block
> > sizes from 8kB to 8MB. I measured the execution time of logical
> > decoding of one transaction that inserted 10M rows. I set
> > logical_decoding_work_mem large enough to avoid spilling behavior. In
> > this scenario, we allocate many memory chunks while decoding the
> > transaction, which results in more malloc() calls with smaller memory
> > block sizes. Here are the results (an average of 3 executions):
> >
> > 8kB: 19747.870 ms
> > 16kB: 19780.025 ms
> > 32kB: 19760.575 ms
> > 64kB: 19772.387 ms
> > 128kB: 19825.385 ms
> > 256kB: 19781.118 ms
> > 512kB: 19808.138 ms
> > 1MB: 19757.640 ms
> > 2MB: 19801.429 ms
> > 4MB: 19673.996 ms
> > 8MB: 19643.547 ms
> >
> > Interestingly, there were no noticeable differences in the execution
> > time. I've checked the number of allocated memory blocks in each case
> > and more blocks are allocated in the smaller block size cases. For
> > example, when logical decoding used the maximum memory (about
> > 1.5GB), we allocated about 80k blocks in the 8kB memory block size
> > case and 80 blocks in the 8MB memory block size case.
> >
>
> What exactly do these test results mean? Do you want to prove that
> there is no regression by using smaller block sizes?
Yes, there was no noticeable performance regression, at least in this
test scenario.
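
For a rough sense of scale, the spread across the averages quoted above
can be computed as in the sketch below. It is only illustrative: the
dictionary and the helper name spread_percent are hypothetical and not
part of any patch, and the timings are simply copied from the results
list. It says nothing about run-to-run variance, which was not reported.

```python
# Average execution times (ms) copied from the quoted benchmark results,
# keyed by generation context block size.
timings_ms = {
    "8kB": 19747.870,
    "16kB": 19780.025,
    "32kB": 19760.575,
    "64kB": 19772.387,
    "128kB": 19825.385,
    "256kB": 19781.118,
    "512kB": 19808.138,
    "1MB": 19757.640,
    "2MB": 19801.429,
    "4MB": 19673.996,
    "8MB": 19643.547,
}


def spread_percent(timings):
    """Return (max - min) as a percentage of the mean timing."""
    values = list(timings.values())
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


# Block sizes with the lowest and highest average execution time.
fastest = min(timings_ms, key=timings_ms.get)
slowest = max(timings_ms, key=timings_ms.get)

print(f"fastest: {fastest}, slowest: {slowest}")
print(f"spread: {spread_percent(timings_ms):.2f}% of the mean")
```

With these numbers the gap between the fastest and slowest block size is
under 1% of the mean execution time.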
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com