From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Denis Laxalde <denis(dot)laxalde(at)dalibo(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> |
Subject: | Re: [EXTERNAL] Re: Add non-blocking version of PQcancel |
Date: | 2024-07-17 19:00:00 |
Message-ID: | f92ce13f-cbad-769d-72df-f5b87717f375@gmail.com |
Lists: | pgsql-hackers |
Hello Thomas,
17.07.2024 03:05, Thomas Munro wrote:
> On Wed, Jul 17, 2024 at 3:08 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>> Ugh. I tried to follow what's going on in that cygwin code, but I gave
>> up pretty quickly. It depends on a mutex, but I didn't see the mutex
>> being defined or initialized anywhere.
> https://github.com/cygwin/cygwin/blob/cygwin-3.5.3/winsup/cygwin/fhandler/socket_inet.cc#L217C1-L217C77
>
> Not obvious how it'd be deadlocking (?), though... it's hard to see
> how anything between LOCK_EVENTS and UNLOCK_EVENTS could escape/return
> early. (Something weird going on with signal handlers? I can't
> imagine where one would call poll() though).
I've simplified the repro to the following:
echo "
-- setup foreign server "loopback" --
CREATE TABLE t1(i int);
CREATE FOREIGN TABLE ft1 (i int) SERVER loopback OPTIONS (table_name 't1');
CREATE FOREIGN TABLE ft2 (i int) SERVER loopback OPTIONS (table_name 't1');
INSERT INTO t1 SELECT i FROM generate_series(1, 100000) g(i);
" | psql
cat << 'EOF' | psql
Select pg_sleep(10);
SET statement_timeout = '10ms';
SELECT 'SELECT count(*) FROM ft1 CROSS JOIN ft2;' FROM generate_series(1, 100)
\gexec
EOF
I attached strace (with --mask=0x251, per [1]) to the query-cancelling
backend and got strace.log (see the attachment), while observing:
ERROR: canceling statement due to statement timeout
...
ERROR: canceling statement due to statement timeout
-- total 14 lines, then the process hung --
-- I interrupted it several seconds later --
As far as I can see (having analyzed a number of runs), the hang occurs
when some itimer-related activity happens before "peek_socket" in this
event sequence:
[main] postgres {pid} select_stuff::wait: res after verify 0
[main] postgres {pid} select_stuff::wait: returning 0
[main] postgres {pid} select: sel.wait returns 0
[main] postgres {pid} peek_socket: read_ready: 0, write_ready: 1, except_ready: 0
(See the last occurrence of the sequence in the log.)
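For anyone who wants to check other runs the same way, here is a minimal sketch of how such a log could be scanned. It assumes each event is on its own line and that the four markers appear verbatim as quoted above. The function names, the max_gap limit, and the idea of treating any non-marker line between the first and last marker as "interleaved activity" are illustrative assumptions, not part of the analysis itself.

```python
from typing import List, Tuple

# Marker substrings copied from the event sequence quoted in the message.
SEQUENCE = [
    "select_stuff::wait: res after verify 0",
    "select_stuff::wait: returning 0",
    "select: sel.wait returns 0",
    "peek_socket: read_ready: 0, write_ready: 1, except_ready: 0",
]


def load_lines(path: str) -> List[str]:
    """Read a log file into a list of lines (hypothetical helper)."""
    with open(path, errors="replace") as f:
        return f.read().splitlines()


def interleaved_events(
    lines: List[str],
    sequence: List[str] = SEQUENCE,
    max_gap: int = 50,  # assumed bound on how far apart the markers may be
) -> List[Tuple[int, List[str]]]:
    """Find each occurrence of the sequence and report foreign lines inside it.

    Returns (start_index, extras) pairs, where start_index is the 0-based
    index of the first marker and extras are the lines between the first
    and last marker that match none of the markers.
    """
    results = []
    first, last = sequence[0], sequence[-1]
    i = 0
    while i < len(lines):
        if first in lines[i]:
            end = None
            # Look ahead a bounded distance for the closing peek_socket line.
            for j in range(i + 1, min(len(lines), i + 1 + max_gap)):
                if last in lines[j]:
                    end = j
                    break
            if end is not None:
                extras = [
                    ln for ln in lines[i + 1:end]
                    if not any(marker in ln for marker in sequence)
                ]
                results.append((i, extras))
                i = end + 1
                continue
        i += 1
    return results
```

Calling interleaved_events(load_lines("strace.log")) and looking at the last pair would show whether anything was logged inside the final occurrence of the sequence; an empty extras list means the four events were adjacent.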
[1] https://cygwin.com/cygwin-ug-net/strace.html
Best regards,
Alexander
Attachment | Content-Type | Size |
---|---|---|
strace.log | text/x-log | 348.3 KB |