From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION |
Date: | 2023-12-04 12:00:27 |
Message-ID: | c4029baf-693d-4bb5-7c57-5bfcdc5572ff@enterprisedb.com |
Lists: | pgsql-hackers |
On 12/4/23 12:37, Amit Kapila wrote:
> On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
>>
>>> thread. I think you can compare the timing of regression tests in
>>> subscription, with and without the patch to show there is no
>>> regression. And probably some tests with a large number of tables for
>>> sync with very little data.
>>
>> I have tested the regression test timings for subscription with and
>> without patch. I also did the timing test for sync of subscription
>> with the publisher for 100 and 1000 tables respectively.
>> I have attached the test script and results of the timing test are as follows:
>>
>> Time taken for test to run in Linux VM
>> Summary               | Subscription Test (sec) | 100 tables in pub and Sub (sec) | 1000 tables in pub and Sub (sec)
>> Without patch Release | 95.564                  | 7.877                           | 58.919
>> With patch Release    | 96.513                  | 6.533                           | 45.807
>>
>> Time taken for test to run in another Linux VM
>> Summary               | Subscription Test (sec) | 100 tables in pub and Sub (sec) | 1000 tables in pub and Sub (sec)
>> Without patch Release | 109.8145                | 6.4675                          | 83.001
>> With patch Release    | 113.162                 | 7.947                           | 87.113
>>
>
> So, on some machines, it may increase the test timing, although not by
> much. I think the reason is probably doing the work in multiple
> transactions for multiple relations. I am wondering whether, instead of
> committing and starting a new transaction before
> wait_for_relation_state_change(), we could do it inside that
> function just before we decide to wait. It is quite possible that in
> many cases we don't need any wait at all.
>
I'm not sure what you mean by "do it". What should the function do?
As for the test results, I very much suspect the differences are
caused simply by random timing variations, or something like that. And I
don't understand what "Performance Machine Linux" is, considering those
timings are slower than on the other two machines.
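
For reference, the relative differences in the quoted timings can be computed with a short script. This is only a minimal sketch: the numbers are copied from the quoted results, while the dictionary layout and the `pct_change` helper are illustrative names, not part of the attached test script.

```python
# Timings in seconds, copied from the quoted results.
# Each tuple is (without patch, with patch); the structure is illustrative only.
timings = {
    "Linux VM": {
        "subscription test": (95.564, 96.513),
        "100 tables": (7.877, 6.533),
        "1000 tables": (58.919, 45.807),
    },
    "another Linux VM": {
        "subscription test": (109.8145, 113.162),
        "100 tables": (6.4675, 7.947),
        "1000 tables": (83.001, 87.113),
    },
}


def pct_change(without_patch, with_patch):
    """Relative change of the patched run vs. the unpatched run, in percent."""
    return (with_patch - without_patch) / without_patch * 100.0


for machine, tests in timings.items():
    for test, (before, after) in tests.items():
        print(f"{machine:18s} {test:18s} {pct_change(before, after):+6.1f}%")
```

For the 100-table and 1000-table runs, the sign of the change differs between the two machines (faster with the patch on one, slower on the other), which is what one would expect from run-to-run noise rather than a systematic slowdown.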
Also, even if it was a bit slower, does it really matter? I mean, the
current code is wrong and can hang indefinitely if it happens to
hit the deadlock. And it's a one-time action, so I don't think it's very
sensitive in terms of performance.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company