From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: tablesync patch broke the assumption that logical rep depends on? |
Date: | 2017-04-13 17:31:32 |
Message-ID: | CAHGQGwH2-Vp5tfZjhdhGx_Acs7kdPdWawOGw-ZPTS9d0i3z5sw@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Apr 14, 2017 at 1:28 AM, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> On 4/10/17 13:28, Fujii Masao wrote:
>> src/backend/replication/logical/launcher.c
>> * Worker started and attached to our shmem. This check is safe
>> * because only launcher ever starts the workers, so nobody can steal
>> * the worker slot.
>>
>> The tablesync patch enabled even worker to start another worker.
>> So the above assumption is not valid for now.
>>
>> This issue seems to cause the corner case where the launcher picks up
>> the same worker slot that previously-started worker has already picked
>> up to start another worker.
>
> I think what the comment should rather say is that workers are always
> started through logicalrep_worker_launch() and worker slots are always
> handed out while holding LogicalRepWorkerLock exclusively, so nobody can
> steal the worker slot.
>
> Does that make sense?
No, unless I'm missing something.
logicalrep_worker_launch() picks an unused worker slot (one whose proc == NULL)
while holding LogicalRepWorkerLock, but it releases the lock before the slot
is marked as used (i.e., before the slot's proc is set to non-NULL). The slot is
marked as used only later, when the newly launched worker calls
logicalrep_worker_attach().

So if another logicalrep_worker_launch() runs after LogicalRepWorkerLock
is released but before the slot is marked as used, it can pick the same slot,
because that slot still looks unused.
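
To make the window concrete, here is a minimal single-threaded sketch that replays the interleaving described above. It is not the code in launcher.c: WorkerSlot, pick_free_slot(), attach_slot() and replay_race() are hypothetical names, the lock is only indicated in comments, and the real launch and attach paths do considerably more work.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 4

/* Simplified stand-in for a worker slot: proc == NULL means "unused". */
typedef struct
{
    const void *proc;
} WorkerSlot;

static WorkerSlot slots[MAX_WORKERS];

/* Models the slot search that happens while LogicalRepWorkerLock is held. */
static int
pick_free_slot(void)
{
    for (int i = 0; i < MAX_WORKERS; i++)
    {
        if (slots[i].proc == NULL)
            return i;
    }
    return -1;
}

/* Models the launched worker attaching later and marking the slot as used. */
static void
attach_slot(int slot, const void *proc)
{
    assert(slot >= 0 && slot < MAX_WORKERS);
    slots[slot].proc = proc;
}

/* Replays the problematic ordering; returns 1 if both launches got the same slot. */
int
replay_race(void)
{
    static const int worker_a = 1;
    static const int worker_b = 2;
    int         slot_a;
    int         slot_b;

    for (int i = 0; i < MAX_WORKERS; i++)
        slots[i].proc = NULL;

    /* Launch #1: lock taken, slot picked, lock released. Slot is not yet marked. */
    slot_a = pick_free_slot();

    /* Launch #2 runs before worker A has attached, so the slot still looks unused. */
    slot_b = pick_free_slot();

    /* Both workers now attach; the second overwrites the first one's claim. */
    attach_slot(slot_a, &worker_a);
    attach_slot(slot_b, &worker_b);

    return slot_a == slot_b;
}
```

In this model replay_race() reports that both launches were handed the same slot, because nothing records the reservation between the slot search and the later attach.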
Regards,
--
Fujii Masao