Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, a(dot)kondratov(at)postgrespro(dot)ru, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2019-10-22 17:22:10
Message-ID: 20191022172210.26bdiv44vwvunrh3@development
Lists: pgsql-hackers

On Tue, Oct 22, 2019 at 11:01:48AM +0530, Dilip Kumar wrote:
>On Tue, Oct 22, 2019 at 10:46 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Thu, Oct 3, 2019 at 1:18 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>> >
>> > I have attempted to test the performance of (Stream + Spill) vs
>> > (Stream + BGW pool) and I can see a similar gain to what Alexey had
>> > shown [1].
>> >
>> > In addition to this, I have rebased the latest patchset [2] without
>> > the two-phase logical decoding patch set.
>> >
>> > Test results:
>> > I have repeated the same test as Alexey [1] for 1kk and 3kk data and
>> > here is my result:
>> > Stream + Spill
>> > N     time on master (sec)   Total xact time (sec)
>> > 1kk   6                      21
>> > 3kk   18                     55
>> >
>> > Stream + BGW pool
>> > N     time on master (sec)   Total xact time (sec)
>> > 1kk   6                      13
>> > 3kk   19                     35
>> >
>>
>> I think the test results for the master are missing.
>Yeah, that time I was planning to compare spill vs bgworker.
>
>> Also, how about
>> running these tests over a network (meaning the master and subscriber
>> are not on the same machine)?
>
>Yeah, we should do that; it will show the merit of streaming
>in-progress transactions.
>

While I agree it's an interesting feature, I think we need to stop
adding more stuff to this patch series - it's already complex enough, and
adding even more (unnecessary) stuff is a distraction and will make it
harder to get anything committed. Typical "scope creep".

I think the current behavior (spill to file) is sufficient for v0 and
can be improved later - that's fine. I don't think we need to bother
with comparisons to master very much, because while it might be a bit
slower in some cases, you can always disable streaming (so if there's a
regression for your workload, you can undo that).

>> In general, yours and Alexey's test results
>> show that there is merit in having workers apply such transactions.
>> OTOH, as noted above [1], we are also worried about the performance
>> of Rollbacks if we follow that approach. I am not sure how much we
>> need to worry about Rollbacks if commits are faster, but can we think
>> of recording the changes in memory and only writing to a file if the
>> changes are above a certain threshold? I think that might help save
>> I/O in many cases. I am not very sure how much additional workers
>> can help if we do that, but they might still help. I think we need
>> to do some tests and experiments to figure out the best approach.
>> What do you think?
>I agree with that point. I think we might need to make some small
>changes and run tests to see what the best method for handling the
>streamed changes at the subscriber end would be.
>
>>
>> Tomas, Alexey, do you have any thoughts on this matter? I think it is
>> important that we figure out the way to proceed in this patch.
>>
>> [1] - https://www.postgresql.org/message-id/b25ce80e-f536-78c8-d5c8-a5df3e230785%40postgrespro.ru
>>
>

I think the patch should do the simplest thing possible, i.e. what it
does today. Otherwise we'll never get it committed.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
