Re: Restart pg_usleep when interrupted

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Restart pg_usleep when interrupted
Date: 2024-07-05 16:49:45
Message-ID: 373D199B-2312-431D-8018-F2808913F979@gmail.com
Lists: pgsql-hackers


> With 50 indexes and 10 parallel workers I can see things like:
>
> 2024-07-02 08:22:23.789 UTC [2189616] LOG: expected 1.000000, actual 239.378368
> 2024-07-02 08:22:24.575 UTC [2189616] LOG: expected 0.100000, actual 224.331737
> 2024-07-02 08:22:25.363 UTC [2189616] LOG: expected 1.300000, actual 230.462793
> 2024-07-02 08:22:26.154 UTC [2189616] LOG: expected 1.000000, actual 225.980803
>
> Means we waited more than the max allowed cost delay (100ms).
>
> With 49 parallel workers, it's worst as I can see things like:
>
> 2024-07-02 08:26:36.069 UTC [2189807] LOG: expected 1.000000, actual 1106.790136
> 2024-07-02 08:26:36.298 UTC [2189807] LOG: expected 1.000000, actual 218.148985
>
> The first actual wait time is about 1 second (it has been interrupted about
> 16300 times during this second).
>
> To avoid this drift, the nanosleep() man page suggests to use clock_nanosleep()
> with an absolute time value, that might be another idea to explore.
>
> [1]: https://www.postgresql.org/message-id/flat/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
>

I could not reproduce the time drift you observed on my machine, so I
am guessing the drift could be worse on certain platforms than on
others.

I also looked into the WaitLatchUs patch proposed by Thomas in [1],
and since my system does not have epoll_pwait(2) available, I could not
achieve the sub-millisecond wait times.

A more portable approach could be to continue using nanosleep and
add checks to ensure that nanosleep exits whenever
it goes past an absolute time. This was suggested by Bertrand in an offline
conversation. I am not yet fully convinced of this idea, but I am posting the
patch that implements it for anyone interested in looking.
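
To make the idea concrete outside of the patch, here is a minimal standalone
sketch. It is not the attached patch and not PostgreSQL code: the names
now_ns() and sleep_until_deadline_us() are hypothetical, and I am assuming
CLOCK_MONOTONIC is available. The deadline is computed once up front, and each
restart after EINTR re-reads the clock instead of trusting the remaining time
reported by nanosleep, so interruptions should not accumulate drift.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static int64_t
now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical sketch: sleep for "usec" microseconds. nanosleep() is
 * restarted after EINTR, but only for the time left until the absolute
 * deadline that was computed before the first sleep.
 */
static void
sleep_until_deadline_us(long usec)
{
    int64_t deadline = now_ns() + (int64_t) usec * 1000;

    for (;;)
    {
        int64_t remaining = deadline - now_ns();
        struct timespec req;

        if (remaining <= 0)
            break;              /* deadline reached or passed */

        req.tv_sec = remaining / 1000000000LL;
        req.tv_nsec = remaining % 1000000000LL;

        /* Stop on a completed sleep or on any error other than EINTR. */
        if (nanosleep(&req, NULL) == 0 || errno != EINTR)
            break;
    }
}
```

The cost is one extra clock_gettime() call per interruption, which may matter
if a process is interrupted thousands of times per second.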

Since sub-millisecond sleep times are not guaranteed, as suggested by
the vacuum_cost_delay docs (see below), an alternative idea
is to use clock_nanosleep for vacuum delay when it's available, else
fall back to WaitLatch.

"While vacuum_cost_delay can be set to fractional-millisecond values,
such delays may not be measured accurately on older platforms"
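
For reference, the clock_nanosleep variant could look roughly like the sketch
below. This is only an illustration under my own assumptions: sleep_abs_us()
is a hypothetical name, CLOCK_MONOTONIC is assumed to be supported, and the
WaitLatch fallback is left out because it needs the backend environment. With
TIMER_ABSTIME the same deadline is passed again after an interruption, so
there is no remaining-time arithmetic to drift.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Hypothetical sketch: sleep for "usec" microseconds against an absolute
 * deadline. Returns 0 on success, or the error number reported by
 * clock_nanosleep().
 */
static int
sleep_abs_us(long usec)
{
    struct timespec deadline;
    int         rc;

    /* Compute the absolute wake-up time once. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += usec / 1000000L;
    deadline.tv_nsec += (usec % 1000000L) * 1000L;
    if (deadline.tv_nsec >= 1000000000L)
    {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    /*
     * clock_nanosleep() returns the error number directly rather than
     * setting errno. On EINTR, retry with the unchanged deadline.
     */
    do
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    while (rc == EINTR);

    return rc;
}
```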

[1] https://www.postgresql.org/message-id/CA%2BhUKGKVbJE59JkwnUj5XMY%2B-rzcTFciV9vVC7i%3DLUfWPds8Xw%40mail.gmail.com

Regards,

Sami
