Re: Improve eviction algorithm in ReorderBuffer

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improve eviction algorithm in ReorderBuffer
Date: 2024-03-27 07:44:44
Message-ID: CAHv8Rj+ADN=Ps5RkyAAMhmmCXaufuE0kRnQhUzddBAuttK_mYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 5, 2024 at 8:50 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Wed, 28 Feb 2024 at 11:40, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Feb 26, 2024 at 7:54 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> >
> > A few comments on 0003:
> > ===================
> > 1.
> > +/*
> > + * Threshold of the total number of top-level and sub transactions
> > that controls
> > + * whether we switch the memory track state. While the MAINTAIN_HEAP state is
> > + * effective when there are many transactions being decoded, in many systems
> > + * there is generally no need to use it as long as all transactions
> > being decoded
> > + * are top-level transactions. Therefore, we use MaxConnections as
> > the threshold
> > + * so we can prevent switch to the state unless we use subtransactions.
> > + */
> > +#define REORDER_BUFFER_MEM_TRACK_THRESHOLD MaxConnections
> >
> > The comment seems to imply that MAINTAIN_HEAP is useful for large
> > number of transactions but ReorderBufferLargestTXN() switches to this
> > state even when there is one transaction. So, basically we use the
> > binary_heap technique to get the largest even when we have one
> > transaction but we don't maintain that heap unless we have
> > REORDER_BUFFER_MEM_TRACK_THRESHOLD number of transactions are
> > in-progress. This means there is some additional work when (build and
> > reset heap each time when we pick largest xact) we have fewer
> > transactions in the system but that may not be impacting us because of
> > other costs involved like serializing all the changes. I think once we
> > can try to stress test this by setting
> > debug_logical_replication_streaming to 'immediate' to see if the new
> > mechanism has any overhead.
>
> I ran the test with a transaction having many inserts:
>
> | 5000 | 10000 | 20000 | 100000 | 1000000 | 10000000
> ------- |-----------|------------|------------|--------------|----------------|----------------
> Head | 26.31 | 48.84 | 93.65 | 480.05 | 4808.29 | 47020.16
> Patch | 26.35 | 50.8 | 97.99 | 484.8 | 4856.95 | 48108.89
>
> The same test with debug_logical_replication_streaming= 'immediate'
>
> | 5000 | 10000 | 20000 | 100000 | 1000000 | 10000000
> ------- |-----------|------------|------------|--------------|----------------|----------------
> Head | 59.29 | 115.84 | 227.21 | 1156.08 | 11367.42 | 113986.14
> Patch | 62.45 | 120.48 | 240.56 | 1185.12 | 11855.37 | 119921.81
>
> The execution time is in milliseconds. The column header indicates the
> number of inserts in the transaction.
> In this case I noticed that the test execution with patch was taking
> slightly more time.

I have ran the tests that Vignesh had reported a issue, the test
results with the latest patch is given below:

Without debug_logical_replication_streaming= 'immediate'
Record|10000000 |1000000 |100000 | 20000 | 10000 | 5000
----------|---------------|-------------|-----------|----------|----------|----------
Head |47563.759| 4917.057|478.923|97.28 |50.368 |25.917
Patch |47445.733| 4722.874|472.817|95.15 |48.801 |26.168
%imp |0.248 | 03.949 |01.274 |02.189|03.111 |-0.968

With debug_logical_replication_streaming= 'immediate'
Record| 10000000 | 1000000 | 100000 | 20000 | 10000 | 5000
----------|----------------|--------------|-------------|-----------|-----------|----------
Head |106281.236|10669.992|1073.815|214.287|107.62 |54.947
Patch |103108.673|10603.139|1064.98 |210.229|106.321|54.218
%imp | 02.985 | 0.626  |0.822 |01.893 |01.207 |01.326

The execution time is in milliseconds. The column header indicates the
number of inserts in the transaction. I can notice with the test
result that the issue has been resolved with the new patch.

Thanks and Regards,
Shubham Khanna.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-03-27 07:47:10 Re: Adding OLD/NEW support to RETURNING
Previous Message Bertrand Drouvot 2024-03-27 07:13:16 Re: pgsql: Track last_inactive_time in pg_replication_slots.