From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: REL_15_STABLE: pgbench tests randomly failing on CI, Windows only |
Date: | 2023-10-11 03:40:26 |
Message-ID: | 20231011034026.75@rfd.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Oct 08, 2023 at 10:00:03PM -0700, David G. Johnston wrote:
> On Sun, Oct 8, 2023 at 9:10 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > I didn't think of any phrasing that clearly explained things without the
> > reader consulting the code. I considered these:
I'm leaning toward:
"socket file descriptor out of range for select(): %d" [style guide violation]
A true style guide purist might bury it in a detail message:
ERROR: socket file descriptor out of range: %d
DETAIL: select() accepts file descriptors from 0 to 1023, inclusive, in this build.
HINT: Try fewer concurrent database clients.
> > "socket file descriptor out of range: %d" [what range?]
> >
> Quick drive-by...but it seems that < 0 is a distinctly different problem
> than exceeding FD_SETSIZE. I'm unsure what would cause the former but the
> error for the later seems like:
>
> error: "Requested socket file descriptor %d exceeds connection limit of
> %d", fd, FD_SETSIZE-1
> hint: Reduce the requested number of concurrent connections
>
> In short, the concept of range doesn't seem applicable here. There is an
> error state at the max, and some invalid system state condition where the
> position within a set is somehow negative. These should be separated -
> with the < 0 check happening first.
I view it as: the range of select()-able file descriptors is [0,FD_SETSIZE).
Negative is out of range.
> And apparently this condition isn't applicable when dealing with jobs in
> connect_slot?
For both pgbench.c and parallel_slot.c, there are sufficient negative-FD
checks elsewhere in the file. Ideally, either both files would have redundant
checks, or neither file would. I didn't feel the need to mess with that part
of the status quo.
> Also, I see that for connections we immediately issue FD_SET
> so this check on the boundary of the file descriptor makes sense. But the
> remaining code in connect_slot doesn't involve FD_SET so the test for the
> file descriptor falling within its maximum seems to be coming out of
> nowhere. Likely this is all good, and the lack of symmetry is just due to
> the natural progressive development of the code, but it stands out to the
> uninitiated looking at this patch.
True. The approach in parallel_slot.c is to check the FD number each time it
opens a socket, after which its loop with FD_SET() doesn't recheck. That's a
bit more efficient than the pgbench.c way, because each file may call FD_SET()
many times per socket. Again, I didn't mess with that part of the status quo.
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2023-10-11 03:40:28 | Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag |
Previous Message | Andres Freund | 2023-10-11 03:39:29 | Re: stopgap fix for signal handling during restore_command |