From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Race conditions in shm_mq.c |
Date: | 2015-08-06 18:46:17 |
Message-ID: | CA+TgmobfiUgEVLAeuXn5N11yqPiwDEtdMaNKtwXpEP+abxxuOA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Aug 6, 2015 at 2:38 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>> During my experiments with parallel workers I sometimes saw the "master" and
>> worker processes blocked. The master uses a shm queue to send data to the worker,
>> with nowait==false on both sides. I concluded that the following happened:
>>
>> The worker process set itself as a receiver on the queue after
>> shm_mq_wait_internal() has completed its first check of "ptr", so this
>> function left sender's procLatch in reset state. But before the procLatch was
>> reset, the receiver still managed to read some data and set sender's procLatch
>> to signal the reading, and eventually called its (receiver's) WaitLatch().
>>
>> So sender has effectively missed the receiver's notification and called
>> WaitLatch() too (if the receiver already waits on its latch, it does not help
>> for sender to call shm_mq_notify_receiver(): receiver won't do anything
>> because there's no new data in the queue).
>>
>> Below is my patch proposal.
>
> Another good catch. However, I would prefer to fix this without
> introducing a "continue" as I think that will make the control flow
> clearer. Therefore, I propose the attached variant of your idea.
Err, that doesn't work at all. Have a look at this version instead.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
fix-shm-mq-attach-race-v2.patch | text/x-diff | 1.6 KB |