From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: strange parallel query behavior after OOM crashes |
Date: | 2017-04-05 14:18:42 |
Message-ID: | 5b81a91d-7c3a-4b5b-38da-b4efb2ba9acb@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 04/05/2017 04:09 PM, Kuntal Ghosh wrote:
> On Wed, Apr 5, 2017 at 7:31 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Apr 5, 2017 at 6:36 AM, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>>> Yes. But, as Robert suggested up in the thread, we should not use
>>> (parallel_register_count = 0) as an alternative to define a bgworker
>>> crash. Hence, I've added an argument named 'wasCrashed' in
>>> ForgetBackgroundWorker to indicate a bgworker crash.
>>
>> Did you intend to attach that patch to this email?
>>
> Actually, I'm confused how we should ensure (register_count >
> terminate_count) invariant. I think there can be a system crash what
> Tomas has suggested up in the thread.
>
> Assert(parallel_register_count - parallel_terminate_count <=
> max_parallel_workers);
> Backend 1 > SET max_parallel_worker = 8;
> Backend 1 > Execute a long running parallel query q1 with number of
> parallel worker spawned is say, 4.
> Backend 2> SET max_parallel_worker = 3;
> Backend 2 > Try to execute any parallel query q2 with number of
> parallel worker spawned > 0.
>
> The above assert statement will bring down the server unnecessarily
> while executing q2. If the assert statement was not there, it could
> have gone ahead without launching any workers.
>
Ah, right. I forgot max_parallel_workers may be changed in session. I
think we can use max_worker_processes instead:
Assert(parallel_register_count - parallel_terminate_count <=
max_worker_processes);
The whole point is that if parallel_terminate_count exceeds
parallel_register_count, the subtraction wraps around to a value close
to UINT_MAX. All we need is an maximum possible delta between the two
values to detect this, and max_worker_processes seems fine. We could
also use UINT_MAX/2 for example.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2017-04-05 14:26:05 | Re: strange parallel query behavior after OOM crashes |
Previous Message | Robert Haas | 2017-04-05 14:18:11 | Re: [sqlsmith] Unpinning error in parallel worker |