Re: Log connection establishment timings

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>, andrey(dot)chudnovskiy(at)microsoft(dot)com, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Subject: Re: Log connection establishment timings
Date: 2025-01-20 15:01:38
Message-ID: Z45lUorFBcmCKxp7@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Jan 06, 2025 at 10:08:36PM +0100, Guillaume Lelarge wrote:
> Hello,
>
> Le lun. 16 déc. 2024 à 22:00, Melanie Plageman <melanieplageman(at)gmail(dot)com>
> a écrit :
>
> > Hi,
> >
> > Users wishing to debug slow connection establishment have little
> > visibility into which steps take the most time. We don't expose any
> > stats and none of the logging includes durations.
> >
> > The attached patch logs durations for authentication, forking the
> > Postgres backend, and the end-to-end connection establishment duration
> > starting from when the postmaster calls accept() and ending the first
> > time the forked backend is ready for query.
> >
> > As an example of how this could be useful, I artificially slowed down
> > authentication and you can see that that is now visible in logs:
> >
> > LOG: backend ready for query. pid=86341. socket=10. connection
> > establishment times (ms): total: 8, fork: 0, authentication: 0
> > LOG: backend ready for query. pid=86794. socket=10. connection
> > establishment times (ms): total: 108, fork: 0, authentication: 100
> >
> > Two notes on implementation:
> >
> > To make this portable, the timestamps captured in the postmaster
> > (socket creation time, fork initiation time) are passed through the
> > ClientSocket and BackendStartupData structures instead of simply saved
> > in backend local memory inherited by the child process.
> >
> > Secondly, I used TimestampTz for the times, but it is pretty unclear
> > to me when instr_time should be used vs TimestampTz for such things.
> > instr_time says it is more portable, but there are all sorts of places
> > that use TimestampTz that would probably need to be portable.
> > Also, INSTR_TIME_SUBTRACT() isn't overflow safe, which must be because
> > the actual data type of ticks is platform dependent and so the caller
> > is responsible for not passing it values that would overflow when
> > subtracted.
> >
> >
> Just a quick note. I like this patch a lot and it would help to diagnose
> some situations our customers may have. I tried the patch on my laptop, and
> it works fine. I'll find some time to read the code as well, but in the
> meantime, this looks like a nice thing to have in PostgreSQL.

Also +1 on the idea.

The patch needed a rebase due to 34486b6092e. I did it in v2 attached (it's
a minor rebase due to the AmRegularBackendProcess() introduction in miscadmin.h).

v2 could rely on AmRegularBackendProcess() instead of AmClientBackendProcess()
but I kept it with AmClientBackendProcess() to reduce "my" changes as compared to
v1.

Regarding the TimestampTz vs instr_time choice, we have things like:

+ TimestampTz fork_time = ((BackendStartupData *) startup_data)->fork_time;
+ TimestampTz cur_time = GetCurrentTimestamp();
+
+ conn_timing.fork_duration = TimestampDifferenceMicroseconds(fork_time,
+ cur_time);

but looking at TimestampDifferenceMicroseconds():

"
/* If start_time is in the future or now, no time has elapsed */
if (start_time >= stop_time)
return 0;
"

I think that it can happen due to time changes.

So with TimestampTz, we would:

1. return 0 if we moved the time backward
2. provide an inflated duration including the time jump (if the time move
forward).

But with instr_time (and on systems that support CLOCK_MONOTONIC) then
pg_clock_gettime_ns() should not be affected by system time change IIUC.

Though time changes are "rare", given the fact that those metrics could provide
"inaccurate" measurements during that particular moment (time change) then it
might be worth considering instr_time instead for this particular metric.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v2-0001-Add-connection-establishment-duration-logging.patch text/x-diff 12.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhou, Zhiguo 2025-01-20 15:04:40 Re: [RFC] Lock-free XLog Reservation from WAL
Previous Message Ilia Evdokimov 2025-01-20 14:20:10 Re: Sample rate added to pg_stat_statements