From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Skip collecting decoded changes of already-aborted transactions |
Date: | 2024-12-10 01:31:09 |
Message-ID: | CAD21AoD38hZHiMGeXbruNomJGSdoZi4hWxPtYJ3fubhvWva3mg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 26, 2024 at 10:01 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Tue, 26 Nov 2024 at 02:58, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 18, 2024 at 11:12 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > >
> > > Few comments:
> >
> > Thank you for reviewing the patch!
> >
> > > 1) Should we have the Assert inside ReorderBufferTruncateTXNIfAborted
> > > instead of having it at multiple callers, ReorderBufferResetTXN also
> > > has the Assert inside the function after truncate of the transaction:
> > > @@ -3672,6 +3758,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
> > > Assert(txn->total_size > 0);
> > > Assert(rb->size >= txn->total_size);
> > >
> > > + /* skip the transaction if aborted */
> > > + if (ReorderBufferTruncateTXNIfAborted(rb, txn))
> > > + {
> > > + /* All changes should be discarded */
> > > + Assert(txn->size == 0 && txn->total_size == 0);
> > > + continue;
> > > + }
> > > +
> > > ReorderBufferStreamTXN(rb, txn);
> > > }
> > > else
> > > @@ -3687,6 +3781,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
> > > Assert(txn->size > 0);
> > > Assert(rb->size >= txn->size);
> > >
> > > + /* skip the transaction if aborted */
> > > + if (ReorderBufferTruncateTXNIfAborted(rb, txn))
> > > + {
> > > + /* All changes should be discarded */
> > > + Assert(txn->size == 0 && txn->total_size == 0);
> > > + continue;
> > > + }
> >
> > Moved.
> >
> > >
> > > 2) txn->txn_flags can be moved to the next line to keep it within 80
> > > chars in this case:
> > > * Check the transaction status by looking CLOG and discard all changes if
> > > * the transaction is aborted. The transaction status is cached in
> > > txn->txn_flags
> > > * so we can skip future changes and avoid CLOG lookups on the next call. Return
> >
> > Fixed.
> >
> > >
> > > 3) Is there any scenario where the Assert can fail as the toast is not reset:
> > > + * Since we don't check the transaction status while replaying the
> > > + * transaction, we don't need to reset toast reconstruction data here.
> > > + */
> > > + ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), false);
> > >
> > > + if (ReorderBufferTruncateTXNIfAborted(rb, txn))
> > > + {
> > > + /* All changes should be discarded */
> > > + Assert(txn->size == 0 && txn->total_size == 0);
> > > + continue;
> > > + }
> >
> > IIUC we reconstruct TOAST data when replaying the transaction. On the
> > other hand, this function is called while adding a decoded change but
> > not when replaying the transaction. So we should not have any toast
> > reconstruction data at this point unless I'm missing something. Do you
> > have any scenario where we call ReorderBufferTruncateTXNIfAborted()
> > while a transaction has TOAST reconstruction data?
>
> I have checked further regarding the toast and verified the population
> of the toast hash. I agree with you on this. Overall, the patch
> appears to be in good shape.
Thank you for the confirmation!
I thought we'd done performance tests with this patch but Michael-san
pointed out we've not done yet. So I've done benchmark tests in two
scenarios:
A. Skip decoding large aborted transactions.
1. Preparation (SQL commands)
create table test (c int);
select pg_create_logical_replication_slot('s', 'test_decoding');
begin;
insert into test select generate_series(1, 1_000_000);
commit;
begin;
insert into test select generate_series(1, 1_000_000);
rollback;
begin;
insert into test select generate_series(1, 1_000_000);
rollback;
2. Performance tests (results are w/o patch vs. w/ patch)
-- causes some spill/streamed transactions
set logical_decoding_work_mem to '64MB';
select 'non-streaming', count(*) from
pg_logical_slot_peek_changes('s', null, null, 'stream-changes',
'false');
-> 2636.208 ms vs. 2070.906 ms
select 'streaming', count(*) from pg_logical_slot_peek_changes('s',
null, null, 'stream-changes', 'true');
-> 910.579 ms vs. 653.574 ms
-- no spill/streamed transactions
set logical_decoding_work_mem to '5GB';
select 'non-streaming', count(*) from
pg_logical_slot_peek_changes('s', null, null, 'stream-changes',
'false');
-> 962.863 ms vs. 956.910 ms
select 'streaming', count(*) from pg_logical_slot_peek_changes('s',
null, null, 'stream-changes', 'true');
-> 973.426 ms vs. 973.033 ms
According to the results, skipping logical decoding of already-aborted
transactions contributes performance improvements.
B. Decoding medium-size transactions to check overheads of CLOG lookups.
1. Preparation (shell script)
pgbench -i -s 1 postgres
psql -c "create table test (c int)"
psql -c "select pg_create_logical_replication_slot('s', 'test_decoding')"
echo "insert into test select generate_series(1, 100)" > /tmp/bench.sql
pgbench -t 10000 -c 10 -j 5 -f /tmp/bench.sql postgres
2. Performance tests
-- spill/streamed transactions
set logical_decoding_work_mem to '64';
select 'non-streaming', count(*) from
pg_logical_slot_peek_changes('s', null, null, 'stream-changes',
'false');
-> 7230.537 ms vs. 7154.322 ms
select 'streaming', count(*) from pg_logical_slot_peek_changes('s',
null, null, 'stream-changes', 'true');
-> 6702.438 ms vs. 6678.232 ms
Overall, I don't see noticeable overheads of CLOG lookups.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Yan Chengpeng | 2024-12-10 01:33:44 | Re: Incorrect EXPLAIN ANALYZE output in bloom index docs |
Previous Message | Michael Paquier | 2024-12-10 01:14:17 | Fix some comments for GUC hooks of timezone_abbreviations |