From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Track the amount of time waiting due to cost_delay |
Date: | 2024-06-11 07:25:11 |
Message-ID: | Zmf712A5xcOM9Hlg@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Mon, Jun 10, 2024 at 05:58:13PM -0400, Robert Haas wrote:
> On Mon, Jun 10, 2024 at 11:36 AM Nathan Bossart
> <nathandbossart(at)gmail(dot)com> wrote:
> > Hm. Should we measure the actual time spent sleeping, or is a rough
> > estimate good enough? I believe pg_usleep() might return early (e.g., if
> > the process is signaled) or late, so this field could end up being
> > inaccurate, although probably not by much. If we're okay with millisecond
> > granularity, my first instinct is that what you've proposed is fine, but I
> > figured I'd bring it up anyway.
>
> I bet you could also sleep for longer than planned, throwing the
> numbers off in the other direction.
Thanks for looking at it! Agree, that's how I read "or late" from Nathan's
comment above.
> I'm always suspicious of this sort of thing. I tend to find nothing
> gives me the right answer unless I assume that the actual sleep times
> are randomly and systematically different from the intended sleep
> times but arbitrarily large amounts. I think we should at least do
> some testing: if we measure both the intended sleep time and the
> actual sleep time, how close are they? Does it change if the system is
> under crushing load (which might elongate sleeps) or if we spam
> SIGUSR1 against the vacuum process (which might shorten them)?
OTOH Sami proposed in [1] to count the number of times the vacuum went into
delay. I think that's a good idea. His idea makes me think that (in addition to
the number of wait times) it would make sense to measure the "actual" sleep time
(and not the intended one) then (so that one could measure the difference between
the intended wait time (number of wait times * cost delay (if it does not change
during the vacuum duration)) and the actual measured wait time).
So I think that in v2 we could: 1) measure the actual wait time instead, 2)
count the number of times the vacuum slept. We could also 3) reports the
effective cost limit (as proposed by Nathan up-thread) (I think that 3) could
be misleading but I'll yield to majority opinion if people think it's not).
Thoughts?
[1]: https://www.postgresql.org/message-id/A0935130-7C4B-4094-B6E4-C7D5086D50EF%40amazon.com
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2024-06-11 07:47:31 | Re: Trying out read streams in pgvector (an extension) |
Previous Message | Kyotaro Horiguchi | 2024-06-11 07:15:48 | Re: 001_rep_changes.pl fails due to publisher stuck on shutdown |