Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken
Date: 2023-03-14 02:38:45
Message-ID: CA+hUKGL=OkAsHBS_TH3v3SRCi3AZd9r2+8PpJ4DR=P9xvnhF5Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 14, 2023 at 12:10 PM Nathan Bossart
<nathandbossart(at)gmail(dot)com> wrote:
> > * NOTE: although the delay is specified in microseconds, the effective
> > - * resolution is only 1/HZ, or 10 milliseconds, on most Unixen. Expect
> > - * the requested delay to be rounded up to the next resolution boundary.
> > + * resolution is only 1/HZ on systems that use periodic kernel ticks to wake
> > + * up. This may cause sleeps to be rounded up by 1-20 milliseconds on older
> > + * Unixen and Windows.
>
> nitpick: Could the 1/HZ versus 20 milliseconds discrepancy cause confusion?
> Otherwise, I think this is the right idea.

Better words welcome. 1-20ms summarises the range I actually measured,
and if reports are correct about Windows' HZ=64 (1/HZ = 15.625ms) then
it neatly covers that too, so I don't feel too bad about not chasing
down the reason for the 10ms/20ms discrepancy. Maybe I looked at the
wrong HZ number (which you can change, anyway); I'm not too used to
NetBSD. BTW, they have a project plan to fix that:
https://wiki.netbsd.org/projects/project/tickless/
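
To make the 1/HZ arithmetic concrete, here is a rough sketch under a
simplified model that I'm assuming for illustration only: the kernel wakes
sleepers only on tick boundaries, so a request is rounded up to a whole
number of ticks. The helper names are made up, and real kernels can add
further latency on top of this.

```c
#include <assert.h>

/* Hypothetical helper: length of one kernel tick in microseconds (1/HZ). */
static long
tick_period_us(long hz)
{
    assert(hz > 0);
    return 1000000L / hz;
}

/*
 * Hypothetical helper: round a requested sleep up to the next multiple of
 * the tick period.  This is a simplified model, not any kernel's actual
 * timer code.
 */
static long
round_up_to_tick_us(long requested_us, long hz)
{
    long        period = tick_period_us(hz);

    return ((requested_us + period - 1) / period) * period;
}
```

With HZ=64 the tick period comes out as 15625 microseconds, matching the
15.625ms figure above, and with HZ=100 a requested 2ms sleep becomes 10ms
under this model.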

> > + * CAUTION: if interrupted by a signal, this function will return, but its
> > + * interface doesn't report that. It's not a good idea to use this
> > + * for long sleeps in the backend, because backends are expected to respond to
> > + * interrupts promptly. Better practice for long sleeps is to use WaitLatch()
> > + * with a timeout.
>
> I'm not sure this argument follows. If pg_usleep() returns if interrupted,
> then why are we concerned about delayed responses to interrupts?

Because you can't rely on it:

1. Maybe the signal is delivered just before pg_usleep() begins, and
a handler sets some flag we would like to react to. Now pg_usleep()
will not be interrupted. That problem is solved by using latches
instead.
2. Maybe the signal is one that is no longer handled by a handler at
all. These days, latches use SIGURG, which is consumed by reading a
signalfd or kqueue rather than by running a handler, so pg_usleep()
will not wake up. That problem is solved by using latches instead.
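
Point 1 can be reproduced outside PostgreSQL. The following is only a
standalone sketch, not PostgreSQL code: it uses the classic self-pipe trick
with poll() as a stand-in for a latch, and SIGUSR1, setup(), and the other
function names are my own choices for illustration. The signal is raised
before each wait begins, so the handler has already run by the time we
start sleeping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;
static int  selfpipe[2] = {-1, -1};

/* Handler: set a flag and make the wakeup "sticky" by writing to the pipe. */
static void
handler(int signo)
{
    int         save_errno = errno;
    ssize_t     rc;

    (void) signo;
    got_signal = 1;
    rc = write(selfpipe[1], "x", 1);    /* write() is async-signal-safe */
    (void) rc;
    errno = save_errno;
}

static void
setup(void)
{
    struct sigaction sa = {0};

    if (pipe(selfpipe) != 0)
        abort();
    fcntl(selfpipe[0], F_SETFL, O_NONBLOCK);
    fcntl(selfpipe[1], F_SETFL, O_NONBLOCK);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);
}

static void
drain_pipe(void)
{
    char        buf[16];

    while (read(selfpipe[0], buf, sizeof(buf)) > 0)
        ;
}

static long
elapsed_ms(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L +
        (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/* Signal arrives first, then a plain sleep: nothing interrupts the sleep. */
static long
sleep_after_signal_ms(long sleep_ms)
{
    struct timespec start;
    struct timespec delay = {sleep_ms / 1000, (sleep_ms % 1000) * 1000000L};

    clock_gettime(CLOCK_MONOTONIC, &start);
    raise(SIGUSR1);             /* handler has run by the time this returns */
    nanosleep(&delay, NULL);
    drain_pipe();
    return elapsed_ms(&start);
}

/* Same ordering, but waiting on the pipe: the earlier wakeup is not lost. */
static long
poll_after_signal_ms(long timeout_ms)
{
    struct timespec start;
    struct pollfd pfd = {.fd = selfpipe[0], .events = POLLIN};

    clock_gettime(CLOCK_MONOTONIC, &start);
    raise(SIGUSR1);
    poll(&pfd, 1, (int) timeout_ms);
    drain_pipe();
    return elapsed_ms(&start);
}
```

After setup(), sleep_after_signal_ms(200) should take roughly the full
200ms even though the flag was set before the sleep started, while
poll_after_signal_ms(200) should return almost immediately. In the backend
the equivalent of the second form is WaitLatch() with a timeout, as the
proposed comment says.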

(The word "interrupt" is a bit overloaded, which doesn't help with
this discussion.)

> > - delay.tv_usec = microsec % 1000000L;
> > - (void) select(0, NULL, NULL, NULL, &delay);
> > + delay.tv_nsec = (microsec % 1000000L) * 1000;
> > + (void) nanosleep(&delay, NULL);
>
> Using nanosleep() seems reasonable to me.
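
For anyone reading along without the patch open, the conversion in that
hunk amounts to something like the sketch below. The function names are
hypothetical and this is not the actual pg_usleep() source; it just shows
the microsecond-to-timespec split and the fact that nanosleep() reports an
early return, which the patched code deliberately ignores with a (void)
cast.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: split a microsecond count into a struct timespec. */
static struct timespec
usec_to_timespec(long microsec)
{
    struct timespec ts;

    assert(microsec >= 0);
    ts.tv_sec = microsec / 1000000L;
    ts.tv_nsec = (microsec % 1000000L) * 1000;  /* leftover usec -> nsec */
    return ts;
}

/*
 * Hypothetical wrapper: returns 0 if the full delay elapsed, or -1 if
 * nanosleep() returned early (for example with errno set to EINTR).
 */
static int
sketch_usleep(long microsec)
{
    struct timespec delay = usec_to_timespec(microsec);

    return nanosleep(&delay, NULL);
}
```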

Thanks for looking!
