From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: logical replication empty transactions
Date: 2021-07-30 05:40:52
Message-ID: CAHut+PuyqcDJO0X2BxY+9ycF+ew3x77FiCbTJQGnLDbNmMASZQ@mail.gmail.com
Lists: pgsql-hackers
Hi Ajin.
I have spent some time studying how your "empty transaction" (v11)
patch will affect network traffic and transaction throughput.
BLUF (Bottom Line Up Front)
====
For my test environment the general observations with the patch applied are:
- There is a potentially large reduction of network traffic (depends
on the number of empty transactions sent)
- Transaction throughput improved up to 7% (average ~2% across
mixtures) for Synchronous mode
- Transaction throughput improved up to 7% (average ~3% across
mixtures) for NOT Synchronous mode
So this patch LGTM.
TEST INFORMATION
================
Overview
-------------
1. There are 2 similar tables. One table is published; the other is not.
2. Equivalent simple SQL operations are performed on these tables. E.g.
- INSERT/UPDATE/DELETE using normal COMMIT
- INSERT/UPDATE/DELETE using 2PC COMMIT PREPARED
3. pgbench is used to measure the throughput for different mixes of
empty and not-empty transactions sent. E.g.
- 0% are empty
- 25% are empty
- 50% are empty
- 75% are empty
- 100% are empty
4. The apply_dispatch code has been temporarily modified to log the
number of protocol messages/bytes being processed.
- At the conclusion of the test run the logs are processed to extract
the numbers.
5. Each test run is 15 minutes elapsed time.
6. The tests are repeated without/with your patch applied
- So, there are 2 (without/with patch) x 5 (different mixes) = 10 test results
- Transaction throughput results are from pgbench
- Protocol message bytes are extracted from the logs (from modified
apply_dispatch)
7. Also, the entire set of 10 test cases was repeated with the
synchronous_standby_names setting enabled/disabled.
- Enabled, so the results are for total round-trip processing of the pub/sub.
- Disabled, so there is no waiting at the publisher side.
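For reference, the log extraction in steps 4 and 6 can be sketched as a small script. The log line format here is a made-up placeholder (the actual lines depend on the temporary apply_dispatch modification), so the pattern would need adjusting to whatever the local change really prints:

```python
import re
from collections import defaultdict

# Hypothetical format emitted by the temporary apply_dispatch logging;
# adjust the pattern to match the actual local modification.
LOG_PATTERN = re.compile(r"apply_dispatch: msg='(?P<type>.)' len=(?P<len>\d+)")

def summarize(log_lines):
    """Return (count, total bytes) per protocol message type."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for line in log_lines:
        m = LOG_PATTERN.search(line)
        if m:
            counts[m.group("type")] += 1
            totals[m.group("type")] += int(m.group("len"))
    return counts, totals

if __name__ == "__main__":
    sample = [
        "LOG:  apply_dispatch: msg='B' len=21",  # BEGIN
        "LOG:  apply_dispatch: msg='I' len=85",  # INSERT
        "LOG:  apply_dispatch: msg='C' len=26",  # COMMIT
    ]
    counts, totals = summarize(sample)
    print(sum(counts.values()), sum(totals.values()))  # 3 messages, 132 bytes
```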
Configuration
-------------------
My environment is a single test machine with 2 PG instances (for pub and sub).
Using default configs except:
PUB-node
- wal_level = logical
- max_wal_senders = 10
- logical_decoding_work_mem = 64kB
- checkpoint_timeout = 30min
- min_wal_size = 10GB
- max_wal_size = 20GB
- shared_buffers = 2GB
- synchronous_standby_names = 'sync_sub' (for synchronous testing only)
SUB-node
- max_worker_processes = 11
- max_logical_replication_workers = 10
- checkpoint_timeout = 30min
- min_wal_size = 10GB
- max_wal_size = 20GB
- shared_buffers = 2GB
SQL files
-------------
Contents of test_empty_not_published.sql:
-- Operations for table not published
BEGIN;
INSERT INTO test_tab_nopub VALUES(1, 'foo');
UPDATE test_tab_nopub SET b = 'bar' WHERE a = 1;
DELETE FROM test_tab_nopub WHERE a = 1;
COMMIT;
-- 2PC operations for table not published
BEGIN;
INSERT INTO test_tab_nopub VALUES(2, 'fizz');
UPDATE test_tab_nopub SET b = 'bang' WHERE a = 2;
DELETE FROM test_tab_nopub WHERE a = 2;
PREPARE TRANSACTION 'gid_nopub';
COMMIT PREPARED 'gid_nopub';
~~
Contents of test_empty_published.sql:
(same as above but the table is called test_tab)
SQL Tables
----------------
(tables are the same apart from the name)
CREATE TABLE test_tab (a int primary key, b text, c timestamptz
DEFAULT now(), d bigint DEFAULT 999);
CREATE TABLE test_tab_nopub (a int primary key, b text, c timestamptz
DEFAULT now(), d bigint DEFAULT 999);
Example pg_bench command
------------------------
(this example shows a test for a 25% mix of empty transactions)
pgbench -s 100 -T 900 -c 1 -f test_empty_not_published.sql@5 -f test_empty_published.sql@15 test_pub
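The pgbench `-f file@weight` syntax selects each script with probability weight/total, so the 5:15 weights above give a 25% empty mix (transactions on the unpublished table decode to empty transactions). The weight arithmetic for the intermediate mixes can be checked quickly; the specific weight pairs below are illustrative (any pair in the same ratio works, and the 0%/100% cases simply run a single -f script):

```python
def empty_mix_percent(empty_weight, published_weight):
    """Percentage of pgbench transactions drawn from the 'empty' script,
    given the -f file@weight weights."""
    return 100 * empty_weight / (empty_weight + published_weight)

# Weight pairs for the intermediate mixes used in the test.
for w_empty, w_pub in [(5, 15), (10, 10), (15, 5)]:
    print(f"@{w_empty}:@{w_pub} -> {empty_mix_percent(w_empty, w_pub):.0f}% empty")
```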
RESULTS / OBSERVATIONS
======================
Synchronous Mode
----------------
- As the percentage mix of empty transactions increases, so does the
transaction throughput. I assume this is because we are using
synchronous mode; when there is less waiting time, there is more time
available for transaction processing.
- The performance was generally similar before/after the patch, but
there was an observed throughput improvement of ~2% (averaged across
all mixes)
- The number of protocol bytes is associated with the number of
transactions that are processed during the test time of 15 minutes.
This adds up to a significant number of bytes even when the
transactions are empty.
- For the unpatched code, as the transaction rate increases, so does
the number of traffic bytes.
- The patch improves this significantly by eliminating all the empty
transaction traffic.
- Before the patch, even "empty transactions" are processing some
bytes, so it can never reach zero. After the patch, empty transaction
traffic is eliminated entirely.
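To illustrate why the traffic falls all the way to zero, here is a toy model of the protocol byte counts. The per-message sizes are made-up placeholders, not measured values; only the shape of the result matters: without the patch every empty transaction still costs a BEGIN + COMMIT pair, while with the patch it costs nothing:

```python
# Toy model of logical replication protocol traffic. The message sizes
# below are illustrative placeholders, not measurements from the test.
BEGIN_BYTES, COMMIT_BYTES, CHANGE_BYTES = 21, 26, 85

def traffic_bytes(n_txns, empty_fraction, patched):
    """Total protocol bytes for a workload with the given empty-txn mix."""
    empty = int(n_txns * empty_fraction)
    nonempty = n_txns - empty
    bytes_out = nonempty * (BEGIN_BYTES + CHANGE_BYTES + COMMIT_BYTES)
    if not patched:
        # Unpatched: empty transactions still send BEGIN + COMMIT.
        bytes_out += empty * (BEGIN_BYTES + COMMIT_BYTES)
    return bytes_out

for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(frac,
          traffic_bytes(1000, frac, patched=False),
          traffic_bytes(1000, frac, patched=True))
```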
NOT Synchronous Mode
--------------------
- Since there is no synchronous waiting for round trips, the
transaction throughput is generally consistent regardless of the empty
transaction mix.
- There is a hint of a small overall improvement in throughput as the
empty transaction mix approaches 100%. For my test environment both
the pub/sub nodes are on the same machine/CPU, so my guess is that
when less CPU is spent processing messages in the apply worker, more
CPU is available to pump transactions at the publisher side.
- Transaction throughput with the patch seems ~3% better than without
it. This might also be attributable to the same reason mentioned
above: less CPU spent processing empty-transaction messages at the
subscriber side leaves more CPU available to pump transactions from
the publisher side.
- The number of protocol bytes is associated with the number of
transactions that are processed during the test time of 15 minutes.
- Because the transaction throughput is consistent, the traffic of
protocol bytes here is determined mainly by the proportion of "empty
transactions" in the mixture.
- Before the patch, even “empty transactions” are processing some
bytes, so it can never reach zero. After the patch, the empty
transaction traffic is eliminated entirely.
ATTACHMENTS
===========
PSA
A1. A PDF version of my test report (also includes raw result data)
A2. Sync: Graph of Transaction throughput
A3. Sync: Graph of Protocol bytes (total)
A4. Sync: Graph of Protocol bytes (per transaction)
A5. Not-Sync: Graph of Transaction throughput
A6. Not-Sync: Graph of Protocol bytes (total)
A7. Not-Sync: Graph of Protocol bytes (per transaction)
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
Attachments:
- PS-empty-tx-testing-15min.pdf (application/pdf, 316.5 KB)
- image/png (19.8 KB)
- image/png (17.8 KB)
- image/png (14.9 KB)
- image/png (18.6 KB)
- image/png (22.0 KB)
- image/png (21.4 KB)