Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Denis Laxalde <denis(dot)laxalde(at)dalibo(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
Subject: Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Date: 2024-07-18 01:06:48
Message-ID: CA+hUKGL6SuCQogCYoRJnfA3zaN_9gh1cJ4AZ_1d0Z1EAqbq-_w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 18, 2024 at 7:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> As far as I can see (having analyzed a number of runs), the hanging occurs
> when some itimer-related activity happens before "peek_socket" in this
> event sequence:
> [main] postgres {pid} select_stuff::wait: res after verify 0
> [main] postgres {pid} select_stuff::wait: returning 0
> [main] postgres {pid} select: sel.wait returns 0
> [main] postgres {pid} peek_socket: read_ready: 0, write_ready: 1, except_ready: 0
>
> (See the last occurrence of the sequence in the log.)

Yeah, right, there's a lot going on between those two lines from the
[main] thread. There are messages from helper threads [itimer], [sig]
and [socksel]. At a guess, [socksel] might be doing extra secret
communication over our socket in order to exchange SO_PEERCRED
information, huh, is that always there? Seems worth filing a bug
report.
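
For anyone not familiar with SO_PEERCRED, here is a minimal sketch of what
that information looks like when the kernel hands it over directly. It
assumes Linux, where getsockopt() returns a struct ucred (pid, uid, gid)
for a connected AF_UNIX socket. The helper name peer_credentials is made up
for illustration, and none of this shows what Cygwin's [socksel] thread
actually does, which remains a guess.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) for the peer of a connected AF_UNIX socket.

    Assumes Linux, where struct ucred is laid out as
    pid_t (int), uid_t (unsigned int), gid_t (unsigned int).
    """
    fmt = "iII"
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


if __name__ == "__main__":
    # Both ends of a socketpair belong to this process, so the reported
    # peer is ourselves.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        pid, uid, gid = peer_credentials(a)
        print(f"peer pid={pid} uid={uid} gid={gid}")
    finally:
        a.close()
        b.close()
```

On Linux no data crosses the socket to answer this query, which is why extra
traffic on the socket itself, if that is what is happening on Cygwin, would
be surprising to a caller waiting in select().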

For the record, I know of one other occasional test failure on Cygwin:
it randomly panics in SnapBuildSerialize(). While I don't expect
there to be any users of PostgreSQL on Cygwin (it was unusably broken
before we refactored the postmaster in v16), that one is interesting
because (1) it also happens on native Windows builds, and (2) at least
one candidate fix[1] sounds like it would speed up logical replication
on all operating systems.

[1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BJ4jSFk%3D-hdoZdcx%2Bp7ru6xuipzCZY-kiKoDc2FjsV7g%40mail.gmail.com#afb5dc4208cc0776a060145f9571dec2
