BUG #18096: In edge-triggered epoll and kqueue, PQconsumeInput/PQisBusy are insufficient for correct async ops.

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: mah0x211(at)gmail(dot)com
Subject: BUG #18096: In edge-triggered epoll and kqueue, PQconsumeInput/PQisBusy are insufficient for correct async ops.
Date: 2023-09-08 00:26:01
Message-ID: 18096-2bd930f8f132fb4e@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18096
Logged by: Masatoshi Fukunaga
Email address: mah0x211(at)gmail(dot)com
PostgreSQL version: 14.4
Operating system: macOS 13.5.1, Ubuntu 22.04
Description:

When processing asynchronous commands, I call `PQconsumeInput()` and
`PQisBusy()` to check whether data has arrived, as shown below. This does
not work correctly in the edge-triggered mode of epoll and kqueue.

I believe the following code does what the manual instructs:

> 34.4. Asynchronous Command Processing
> https://www.postgresql.org/docs/current/libpq-async.html

```C
#include <libpq-fe.h>

// in edge-triggered mode, this code does not work correctly
/**
 * check if the result is readable or not
 * @return 1: readable, 0: not readable, -1: error
 */
int is_readable(PGconn *conn) {
    if (!PQconsumeInput(conn)) {
        // caller should call PQerrorMessage to get the error message
        return -1;
    } else if (!PQisBusy(conn)) {
        // caller can call PQgetResult to get the result
        return 1;
    }
    // caller should wait for the socket to become readable
    return 0;
}
```

The `PQconsumeInput()` function reads input data by calling the
`pqReadData()` function internally, which in turn uses the
`pqsecure_read()` function to read from the socket.

However, the `pqReadData()` function does not keep calling the
`pqsecure_read()` function until `errno` is set to `EAGAIN` or
`EWOULDBLOCK`. So if you poll after the `PQisBusy()` call returns `1`, the
readable event will not fire and the caller will wait forever.
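
To illustrate the edge-trigger behavior described above without involving libpq, here is a minimal Linux-only sketch. It assumes epoll and a Unix socket pair; the function name `events_after_partial_read` is hypothetical and exists only for this example. It writes 8 bytes, waits for the first readable event, reads only 4 bytes, and then asks epoll again.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns the number of events epoll_wait() reports after a partial read
 * on an edge-triggered descriptor, or -1 if the setup fails.
 * Hypothetical helper for illustration; not part of libpq.
 */
int events_after_partial_read(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }

    int ep = epoll_create1(0);
    if (ep < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = sv[0] };
    char buf[4];
    int n = -1;

    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) == 0 &&
        write(sv[1], "12345678", 8) == 8 &&
        epoll_wait(ep, &ev, 1, 1000) == 1 &&   /* edge: data arrived */
        read(sv[0], buf, sizeof buf) == 4) {   /* consume 4 of 8 bytes */
        /* 4 bytes are still buffered, but no new data has arrived */
        n = epoll_wait(ep, &ev, 1, 100);
    }

    close(ep);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

On Linux I would expect this to return `0`: the remaining 4 bytes do not produce a second readable event, which is the same situation as polling after `PQisBusy()` returns `1` while unread data is still in the socket.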

By the way, I am aware that even if a read by `pqsecure_read()` does not
end with `EAGAIN` or `EWOULDBLOCK`, the event will still be raised as long
as all the data in the socket has been read.

The problem seems to be that `PQisBusy()` is returning `1`, but the
preceding call to `PQconsumeInput()` has not read all the data in the
socket.

So, if I check `errno` and branch as follows, it works fine.

```C
#include <errno.h>
#include <libpq-fe.h>

/**
 * check if the result is readable or not
 * @return 1: readable, 0: not readable, -1: error
 */
int is_readable(PGconn *conn) {
    int should_retry = 0;

RETRY:
    errno = 0;
    if (!PQconsumeInput(conn)) {
        // caller should call PQerrorMessage to get the error message
        return -1;
    }
    // if the last read did not end with EAGAIN/EWOULDBLOCK,
    // there may still be unread data in the socket
    should_retry = errno != EAGAIN && errno != EWOULDBLOCK;

    if (!PQisBusy(conn)) {
        // caller can call PQgetResult to get the result
        return 1;
    } else if (should_retry) {
        // it is necessary to retry because the data has not been read
        // completely
        goto RETRY;
    }
    // caller should wait for the socket to become readable
    return 0;
}
```
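
The retry loop above follows the usual idiom for edge-triggered descriptors: keep reading until the kernel reports `EAGAIN` or `EWOULDBLOCK`. As a rough sketch of that idiom on a plain non-blocking descriptor, independent of libpq, something like the following could be used. The names `set_nonblocking` and `drain_fd` are hypothetical helpers introduced only for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns -1 on failure. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Read from a non-blocking descriptor until nothing is left.
 * Returns the number of bytes consumed, or -1 on a real error.
 * The data is discarded here; a real caller would buffer it.
 */
long drain_fd(int fd) {
    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* more data may remain, keep going */
            continue;
        }
        if (n == 0) {
            return total;        /* peer closed the connection */
        }
        if (errno == EINTR) {
            continue;            /* interrupted by a signal, retry */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;        /* fully drained, safe to wait for an edge */
        }
        return -1;
    }
}
```

Only after `drain_fd()` returns because of `EAGAIN` or `EWOULDBLOCK` is it safe to go back to waiting in edge-triggered mode, since the next readable event then corresponds to newly arrived data.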
