Re: Async queries pretending to be synchronous

From: Dave Page <dpage(at)pgadmin(dot)org>
To: Khushboo Vashi <khushboo(dot)vashi(at)enterprisedb(dot)com>
Cc: pgadmin-hackers <pgadmin-hackers(at)postgresql(dot)org>
Subject: Re: Async queries pretending to be synchronous
Date: 2019-03-05 16:43:42
Message-ID: CA+OCxoxb43AWwgyF6on=8cBLi_-Bkcean2Aiu=daVMPYW7YHqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgadmin-hackers

On Tue, Mar 5, 2019 at 6:36 AM Khushboo Vashi <
khushboo(dot)vashi(at)enterprisedb(dot)com> wrote:

>
>
> On Mon, Mar 4, 2019 at 4:05 PM Khushboo Vashi <
> khushboo(dot)vashi(at)enterprisedb(dot)com> wrote:
>
>>
>>
>> On Mon, Mar 4, 2019 at 4:00 PM Dave Page <dpage(at)pgadmin(dot)org> wrote:
>>
>>>
>>>
>>> On Mon, Mar 4, 2019 at 10:18 AM Khushboo Vashi <
>>> khushboo(dot)vashi(at)enterprisedb(dot)com> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Mar 4, 2019 at 3:43 PM Dave Page <dpage(at)pgadmin(dot)org> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> On Mon, Mar 4, 2019 at 5:43 AM Khushboo Vashi <
>>>>> khushboo(dot)vashi(at)enterprisedb(dot)com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 1, 2019 at 10:01 PM Dave Page <dpage(at)pgadmin(dot)org> wrote:
>>>>>>
>>>>>>> In investigating #3656 I found the initial problem to be that when
>>>>>>> running in a container, Gunicorn will kill the worker process if a thread
>>>>>>> doesn't respond for 30 seconds by default. I fixed that by making the
>>>>>>> timeout match the application session timeout, but it revealed another
>>>>>>> issue.
>>>>>>>
>>>>>>> Given the function below (from the ticket), if you open the query
>>>>>>> tool and run:
>>>>>>>
>>>>>>> SELECT 1; SELECT fails_after(30);
>>>>>>>
>>>>>>> the async query actually blocks for 30 seconds in the cur.execute()
>>>>>>> call in execute_async() in connection.py (line 968). This causes the entire
>>>>>>> app to hang (watch the dashboard requests pile up in pending state in the
>>>>>>> network tab of the browser dev tools).
>>>>>>>
>>>>>>> If you run just the second SELECT, it works as expected, as does
>>>>>>> running something like:
>>>>>>>
>>>>>>> SELECT 1; SELECT pg_sleep(30);
>>>>>>>
>>>>>>> Anyone have any idea what's going on?
>>>>>>>
>>>>>>
>>>>>> Connection.poll() blocking the call here. ( connection.py - _wait_timeout
>>>>>> function line #1378 state = conn.poll() )
>>>>>> In the asynchronous connection, after executing the query, the
>>>>>> conn.poll() is being called to fetch the connection status.
>>>>>> It gives the status accordingly but in this case, it is blocking and
>>>>>> not giving the status.
>>>>>>
>>>>>
>>>>> If I put a breakpoint on the _wait_timeout call (line 969 in
>>>>> execute_async), it hits it *after* the 30 seconds has passed (during which
>>>>> time all other queries for cancel or dashboards and status etc don't get
>>>>> processed).
>>>>>
>>>>> Put the breakpoint in the _wait_timeout function itself. The line no
>>>> 1378 (state = conn.poll()) is blocking the execution 30 seconds.
>>>>
>>>>> If I walk down the call stack, it returns from the _cursor.execute
>>>>> call in cursor.py and then hangs right before it goes into _wait_timeout.
>>>>>
>>>>
>>> Hmm, yeah - my debugger is doing some weird off-by-one thing. If I put
>>> the break point on the while 1, it blocks there! If I put it on the poll()
>>> call, then it does do as you describe.
>>>
>> It's a psycopg2 call, it should return the specific status but it
>> doesn't, so right now I am digging the Psycopg2 code.
>>
>
> If I remove RAISE statements and then execute SELECT 1; SELECT
> fails_after(30); it works as expected.
>
> As per the documentation, (Ref:
> http://initd.org/psycopg/docs/connection.html)
> Connection.poll returns one of the constants defined in Poll constants
> <http://initd.org/psycopg/docs/extensions.html#poll-constants>. If it
> returns POLL_OK
> <http://initd.org/psycopg/docs/extensions.html#psycopg2.extensions.POLL_OK> then
> the connection has been established or the query results are available on
> the client. Otherwise wait until the file descriptor returned by fileno()
> <http://initd.org/psycopg/docs/connection.html#connection.fileno> is
> ready to read or to write.
>
> So, in case of Raise no file descriptor to read or write, and so before
> raising the notice, It will block the connection till the pg_sleep
> completes.
>

Thanks - confirmed, and created a test case for a bug report:
https://github.com/psycopg/psycopg2/issues/856

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgadmin-hackers by date

  From Date Subject
Next Message Fahar Abbas 2019-03-07 09:42:08 Re: pgAdmin4 v4.3 candidate builds
Previous Message Dave Page 2019-03-05 14:08:22 pgAdmin 4 commit: String review and subsequent cleanup