From: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com> |
---|---|
To: | Nikita Malakhov <hukutoc(at)gmail(dot)com> |
Cc: | Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: Extension for adding distributed tracing - pg_tracing |
Date: | 2023-08-01 13:19:41 |
Message-ID: | CAO6_Xqp83Rz7CZcb=OqrMxC07VUw08=NrCnVvjWN7arQYc4xNg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Here's a new patch with changes from the previous discussion:
- I'm now directly storing nanoseconds duration in the span instead of the
instr_time. Using the instr_time macros was a bit awkward as the
durations I generate don't necessarily have a starting and ending
instr_time.
Moving to straight nanoseconds made things clearer from my point of view.
- I've added an additional sample rate pg_tracing.sample_rate (on top of
the pg_tracing.caller_sample_rate). This one will allow queries to be
sampled even without trace information propagated from the caller.
Setting this sample rate to 1 will basically trace everything. For now,
this will only work when going through the post parse hook. I will add
support for prepared statements and cached plans for the next patch.
- I've improved how parse spans are created. It's a bit challenging to get
the start of a parse as there's no pre parse hook or instrumentation around
parse so it's only an estimation.
Regards,
Anthonin
On Fri, Jul 28, 2023 at 4:06 PM Anthonin Bonnefoy <
anthonin(dot)bonnefoy(at)datadoghq(dot)com> wrote:
> > What do you think about using INSTR_TIME_SET_CURRENT,
> INSTR_TIME_SUBTRACT and INSTR_TIME_GET_MILLISEC
> > macros for timing calculations?
> If you're talking of the two instances where I'm modifying the
> instr_time's ticks, it's because I can't use the macros there.
> The first case is for the parse span. I only have the start timestamp
> using GetCurrentStatementStartTimestamp and don't
> have access to the start instr_time so I need to build the duration from 2
> timestamps.
> The second case is when building node spans from the planstate. I directly
> have the duration from Instrumentation.
>
> I guess one fix would be to use an int64 for the span duration to directly
> store nanoseconds instead of an instr_time
> but I do use the instrumentation macros outside of those two cases to get
> the duration of other spans.
>
> > Also, have you thought about a way to trace existing (running) queries
> without directly instrumenting them?
> That's a good point. I was focusing on leaving the sampling decision to
> the caller through the sampled flag and
> only recently added the pg_tracing_sample_rate parameter to give more
> control. It should be straightforward to
> add an option to create standalone traces based on sample rate alone. This
> way, setting the sample rate to 1
> would force the queries running in the session to be traced.
>
>
> On Fri, Jul 28, 2023 at 3:02 PM Nikita Malakhov <hukutoc(at)gmail(dot)com> wrote:
>
>> Hi!
>>
>> What do you think about using INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT
>> and INSTR_TIME_GET_MILLISEC
>> macros for timing calculations?
>>
>> Also, have you thought about a way to trace existing (running) queries
>> without directly instrumenting them? It would
>> be much more usable for maintenance and support personnel, because in
>> production environments you rarely could
>> change query text directly. For the current state the most simple
>> solution is switch tracing on and off by calling SQL
>> function, and possibly switch tracing for prepared statement the same
>> way, but not for any random query.
>>
>> I'll check the patch for the race conditions.
>>
>> --
>> Regards,
>> Nikita Malakhov
>> Postgres Professional
>> The Russian Postgres Company
>> https://postgrespro.ru/
>>
>
Attachment | Content-Type | Size |
---|---|---|
pg-tracing-v2.patch | application/octet-stream | 109.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-08-01 13:20:14 | Re: Oversight in reparameterize_path_by_child leading to executor crash |
Previous Message | Aleksander Alekseev | 2023-08-01 13:14:54 | Re: Incorrect handling of OOM in WAL replay leading to data loss |