From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Benoit Lobréau <benoit(dot)lobreau(at)dalibo(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | RE: long-standing data loss bug in initial sync of logical replication |
Date: | 2025-03-03 07:41:08 |
Message-ID: | OS0PR01MB571616A2C303FCED3CF3D67C94C92@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
On Friday, February 28, 2025 4:28 PM Benoit Lobréau <benoit(dot)lobreau(at)dalibo(dot)com> wrote:
>
> It took me a while but I ran the test on my laptop with 20 runs per test. I asked
> for a dedicated server and will re-run the tests if/when I have it.
>
> count of partitions | Head (sec) | Fix (sec) | Degradation (%)
> ----------------------------------------------------------------------
> 1000 | 0,0265 | 0,028 | 5,66037735849054
> 5000 | 0,091 | 0,0945 | 3,84615384615385
> 10000 | 0,1795 | 0,1815 | 1,11420612813371
>
> Concurrent Txn | Head (sec) | Patch (sec) | Degradation in %
> ---------------------------------------------------------------------
> 50 | 0,1797647 | 0,1920949 | 6,85907744957
> 100 | 0,3693029 | 0,3823425 | 3,53086856344
> 500 | 1,62265755 | 1,91427485 | 17,97158617972
> 1000 | 3,01388635 | 3,57678295 | 18,67676928162
> 2000 | 7,0171877 | 6,4713304 | 8,43500897435
>
> I'll try to run test2.pl later (right now it fails).
>
> hope this helps.
Thank you for testing and sharing the data!
One nitpick about the Concurrent Txn (2000) row: the results show HEAD as
slower than the patch, which seems unusual. However, I confirmed that the
details in the attachment are as expected, so this seems to be a typo in the
table. (I assume you intended to use a decimal point instead of a comma in
values such as 8,43500...)
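
To make the arithmetic behind this nitpick concrete, here is a minimal sketch. It assumes the degradation column was computed as (patch - head) / head * 100, which is inferred from the other rows rather than stated in the original mail; the helper names are hypothetical.

```python
def degradation_pct(head_sec: float, patch_sec: float) -> float:
    """Percentage slowdown of the patched build relative to HEAD.

    Assumed formula, inferred from the quoted table, not confirmed by its author.
    """
    return (patch_sec - head_sec) / head_sec * 100.0


def parse_decimal_comma(text: str) -> float:
    """Convert a decimal-comma string such as '0,1797647' to a float."""
    return float(text.replace(",", "."))


# Row for 50 concurrent transactions, values copied from the quoted table.
row_50 = degradation_pct(parse_decimal_comma("0,1797647"),
                         parse_decimal_comma("0,1920949"))

# Row for 2000 concurrent transactions, first as printed (Head, Patch) ...
as_printed_2000 = degradation_pct(parse_decimal_comma("7,0171877"),
                                  parse_decimal_comma("6,4713304"))
# ... and then with the two timings exchanged.
swapped_2000 = degradation_pct(parse_decimal_comma("6,4713304"),
                               parse_decimal_comma("7,0171877"))

print(f"50 txns:               {row_50:.4f} %")
print(f"2000 txns, as printed: {as_printed_2000:.4f} %")
print(f"2000 txns, swapped:    {swapped_2000:.4f} %")
```

Under that assumed formula, the 2000 row as printed gives a negative value, while exchanging the two timings gives a positive value close to the reported 8,435. That is consistent with the row containing a transcription slip rather than the patch actually being faster.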
The data suggests some regression, slightly more than Shlok's findings, but it
is still within an acceptable range for me. Since the test script builds a real
subscription for testing, the results might be affected by network and
replication factors, as Amit pointed out. We will soon share a new test script
that uses the SQL API xxx_get_changes() instead. It would be great if you could
verify the performance using the updated script as well.
Best Regards,
Hou zj