Re: Restart pg_usleep when interrupted

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Restart pg_usleep when interrupted
Date: 2024-07-12 17:14:56
Message-ID: 01A15AEA-C35C-41DF-8E81-3B5A0B523939@gmail.com
Lists: pgsql-hackers

>
> I'm imagining something like this:
>
>     struct timespec delay;
>     TimestampTz end_time;
>
>     end_time = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), msec);
>
>     do
>     {
>         long secs;
>         int microsecs;
>
>         TimestampDifference(GetCurrentTimestamp(), end_time,
>                             &secs, &microsecs);
>
>         delay.tv_sec = secs;
>         delay.tv_nsec = microsecs * 1000;
>
>     } while (nanosleep(&delay, NULL) == -1 && errno == EINTR);
>

I do agree that this is cleaner code, but I am not sure I like this.

1/ The TimestampDifference approach has a dependency on gettimeofday
(through GetCurrentTimestamp), while my proposal utilizes clock_gettime.
There are old discussions comparing both mechanisms that did not reach
a conclusion. My main takeaway from these hackers discussions [1], [2]
and other online discussions on the topic is that clock_gettime should
replace gettimeofday when possible. Precision is the main reason.

2/ It no longer uses the remaining time reported by nanosleep. I think
the remaining time is still required here. I did an unrealistic stress
test, which shows the original proposal can handle frequent interruptions
much better.
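
To make the two points above concrete, here is a minimal standalone sketch of a remainder-based sleep that is timed with clock_gettime. It is not the patch from this thread: the helper names (elapsed_ms, sleep_with_remainder, timed_sleep_ms) and the choice of CLOCK_MONOTONIC are assumptions made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: milliseconds elapsed between two clock readings. */
static double
elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (end->tv_sec - start->tv_sec) * 1000.0 +
           (end->tv_nsec - start->tv_nsec) / 1000000.0;
}

/*
 * Hypothetical helper: sleep for usec microseconds. If a signal interrupts
 * nanosleep(), restart it with the unslept remainder instead of the full
 * delay. Returns 0 on success, -1 on any error other than EINTR.
 */
static int
sleep_with_remainder(long usec)
{
    struct timespec delay;
    struct timespec remain;

    assert(usec >= 0);

    delay.tv_sec = usec / 1000000L;
    delay.tv_nsec = (usec % 1000000L) * 1000L;

    while (nanosleep(&delay, &remain) == -1)
    {
        if (errno != EINTR)
            return -1;
        /* Interrupted: nanosleep() stored the time still owed in remain. */
        delay = remain;
    }
    return 0;
}

/*
 * Hypothetical helper: sleep for usec microseconds and return the actual
 * time slept in milliseconds (or -1.0 on error), measured with
 * clock_gettime(CLOCK_MONOTONIC) rather than gettimeofday().
 */
static double
timed_sleep_ms(long usec)
{
    struct timespec start;
    struct timespec end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    if (sleep_with_remainder(usec) != 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return elapsed_ms(&start, &end);
}
```

Calling timed_sleep_ms(10000) requests a 10 ms sleep and returns the measured duration, which is the same requested-versus-actual comparison used in the test that follows.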

#1 In one session, I kicked off a vacuum:

set vacuum_cost_delay = 10;
set vacuum_cost_limit = 1;
set client_min_messages = log;
update large_tbl set version = 1;
vacuum (verbose, parallel 4) large_tbl;

#2 In another session, I ran a loop to continually
interrupt the vacuum leader. This was during the
"heap scan" phase of the vacuum.

PID=< pid of vacuum leader >
while :
do
    kill -USR1 $PID
done

Using the proposed loop with the remainder, I noticed that
the actual time reported remains close to the requested
delay time (each LOG line shows requested,actual in milliseconds).

LOG: 10.000000,10.013420
LOG: 10.000000,10.011188
LOG: 10.000000,10.010860
LOG: 10.000000,10.014839
LOG: 10.000000,10.004542
LOG: 10.000000,10.006035
LOG: 10.000000,10.012230
LOG: 10.000000,10.014535
LOG: 10.000000,10.009645
LOG: 10.000000,10.000817
LOG: 10.000000,10.002162
LOG: 10.000000,10.011721
LOG: 10.000000,10.011655

Using the approach mentioned by Nathan, there
are large differences between requested and actual time.

LOG: 10.000000,17.801778
LOG: 10.000000,12.795450
LOG: 10.000000,11.793723
LOG: 10.000000,11.796317
LOG: 10.000000,13.785993
LOG: 10.000000,11.803775
LOG: 10.000000,15.782767
LOG: 10.000000,31.783901
LOG: 10.000000,19.792440
LOG: 10.000000,21.795795
LOG: 10.000000,18.800412
LOG: 10.000000,16.782886
LOG: 10.000000,10.795197
LOG: 10.000000,14.793333
LOG: 10.000000,29.806556
LOG: 10.000000,18.810784
LOG: 10.000000,11.804956
LOG: 10.000000,24.809812
LOG: 10.000000,25.815600
LOG: 10.000000,22.809493
LOG: 10.000000,22.790908
LOG: 10.000000,19.699097
LOG: 10.000000,23.795613
LOG: 10.000000,24.797078

Let me know what you think.

[1] https://www.postgresql.org/message-id/flat/31856.1400021891%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/flat/E1cO7fR-0003y0-9E%40gemulon.postgresql.org

Regards,

Sami
