From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-14 13:09:07
Message-ID: CACACo5SKOxdPJ54MwNxuK0CdHf7pp3mB5eN-Ha5e4WDg9i1Ksw@mail.gmail.com
Lists: pgsql-hackers
On Mon, Sep 14, 2015 at 2:11 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:
>
>> Now the backend that has been signaled on the second call to
>> pg_cmdstatus (it can be either some other backend, or the backend B
>> again) will not find an unprocessed slot, thus it will not try to
>> attach/detach the queue and the backend A will block forever.
>>
>> This requires really bad timing, and the user should still be able to
>> interrupt the querying backend A.
>>
>
> I think we can't rely on the low probability that this won't happen, and
> we should not rely on people interrupting the backend. Being able to detect
> the situation and fail gracefully should be possible.
>
> It may be possible to introduce some lock-less protocol preventing such
> situations, but it's not there at the moment. If you believe it's possible,
> you need to explain and "prove" that it's actually safe.
>
> Otherwise we may need to introduce some basic locking - for example we may
> introduce a LWLock for each slot, and lock it with dontWait=true (and skip
> it if we couldn't lock it). This should prevent most scenarios where one
> corrupted slot blocks many processes.
OK, I will revisit this part then.
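Just so we're on the same page, I take the suggestion to mean something
along these lines (a sketch only; the slot struct and function names are
made up for illustration, the point being the non-blocking
LWLockConditionalAcquire()):

#include "postgres.h"
#include "storage/lwlock.h"

typedef struct CmdStatusSlot
{
    LWLock     *lock;            /* protects this slot */
    pid_t       target_pid;      /* backend being queried */
    bool        is_processed;    /* has the target picked this up? */
} CmdStatusSlot;

static bool
cmdstatus_claim_slot(CmdStatusSlot *slot, pid_t target_pid)
{
    /* Don't wait: if somebody else holds the lock, skip this slot. */
    if (!LWLockConditionalAcquire(slot->lock, LW_EXCLUSIVE))
        return false;

    slot->target_pid = target_pid;
    slot->is_processed = false;
    LWLockRelease(slot->lock);
    return true;
}

That way a slot that is busy (or corrupted and stuck) gets skipped
instead of blocking every process that walks the array.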
>> In any case, the backends that are being asked to send the info will be
>> able to notice the problem (receiver detached early) and handle it
>> gracefully.
>>
>
> Ummm, how? Maybe I missed something?
Well, I didn't attach the updated patch (doing that now). The basic idea
is that when the backend that has requested the information bails out
prematurely, it still detaches from the shared memory queue. This makes
it possible for the backend being asked to detect the situation, either
before attaching to the queue or when trying to send the data, so it
won't be blocked forever if the other backend failed to wait.
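On the sending side this boils down to checking the result of
shm_mq_send() (a sketch; the wrapper function around it is made up):

#include "postgres.h"
#include "storage/shm_mq.h"

/*
 * If the requesting backend has already detached from the queue,
 * shm_mq_send() returns SHM_MQ_DETACHED instead of blocking, so the
 * queried backend can simply give up.
 */
static void
send_cmd_status(shm_mq_handle *mqh, const char *payload, Size len)
{
    shm_mq_result res;

    res = shm_mq_send(mqh, len, payload, false);
    if (res == SHM_MQ_DETACHED)
        return;     /* receiver bailed out early, nothing more to do */

    /* SHM_MQ_SUCCESS: the message has been queued for the receiver */
}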
>> I don't think we should mix this with monitoring of auxiliary
>> processes. This interface is designed for monitoring SQL queries
>> running in other backends, effectively "remote" EXPLAIN. But those
>> auxiliary processes are not processing SQL queries at all, they're
>> not even using regular executor ...
>>
>> OTOH the ability to request this info (e.g. auxiliary process
>> looking at plans running in backends) seems useful, so I'm ok with
>> tuple slots for auxiliary processes.
>>
>>
>> Now that I think about it, reserving the slots for aux processes
>> doesn't let us query their status; it's the other way round. If we
>> don't reserve them, then an aux process would not be able to query any
>> other process for the status. Likely this is not a problem at all, so
>> we can remove these extra slots.
>>
>
> I don't know. I can imagine using this from background workers, but I
> think those are counted as regular backends (not sure though).
MaxBackends includes the background workers, yes.
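If I'm reading postinit.c correctly, it is computed as:

/* InitializeMaxBackends() */
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
    max_worker_processes;

and background workers fall under max_worker_processes there.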
--
Alex