From: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, gkokolatos(at)protonmail(dot)com, kasahara(dot)tatsuhito(at)gmail(dot)com, craig(at)2ndquadrant(dot)com |
Subject: | Re: Get memory contexts of an arbitrary backend process |
Date: | 2021-03-30 13:06:58 |
Message-ID: | bbecd722d4f8e261b350186ac4bf68a7@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2021-03-30 02:28, Fujii Masao wrote:
Thanks for reviewing and kind suggestions!
>> It adds pg_log_backend_memory_contexts(pid) which logs memory contexts
>> of the specified backend process.
>>
>> The number of child contexts to be logged per parent is limited to 100
>> as with MemoryContextStats().
>>
>> As written in commit 7b5ef8f2d07, which limits the verbosity of
>> memory context statistics dumps, it supposes that practical cases
>> where the dump gets long will typically be huge numbers of
>> siblings under the same parent context; while the additional
>> debugging value from seeing details about individual siblings
>> beyond 100 will not be large.
>>
>> Thoughts?
>
> I'm OK with 100. We should comment why we chose 100 for that.
Added following comments.
+ /*
+ * When a backend process is consuming huge memory, logging all
its
+ * memory contexts might overrun available disk space. To
prevent
+ * this, we limit the number of child contexts per parent to
100.
+ *
+ * As with MemoryContextStats(), we suppose that practical cases
+ * where the dump gets long will typically be huge numbers of
+ * siblings under the same parent context; while the additional
+ * debugging value from seeing details about individual siblings
+ * beyond 100 will not be large.
+ */
+ MemoryContextStatsDetail(TopMemoryContext, 100, false);
>
> Here are some review comments.
>
> Isn't it better to move HandleProcSignalLogMemoryContext() and
> ProcessLogMemoryContextInterrupt() to mcxt.c from procsignal.c
> (like the functions for notify interrupt are defined in async.c)
> because they are the functions for memory contexts?
Agreed.
Also renamed HandleProcSignalLogMemoryContext to
HandleLogMemoryContextInterrupt.
> + * HandleProcSignalLogMemoryContext
> + *
> + * Handle receipt of an interrupt indicating log memory context.
> + * Signal handler portion of interrupt handling.
>
> IMO it's better to comment why we need to separate the function into
> two,
> i.e., HandleProcSignalLogMemoryContext() and
> ProcessLogMemoryContextInterrupt(),
> like the comment for other similar function explains. What about the
> followings?
Thanks! Changed them to the suggested one.
> -------------------------------
> HandleLogMemoryContextInterrupt
>
> Handle receipt of an interrupt indicating logging of memory contexts.
>
> All the actual work is deferred to ProcessLogMemoryContextInterrupt(),
> because we cannot safely emit a log message inside the signal handler.
> -------------------------------
> ProcessLogMemoryContextInterrupt
>
> Perform logging of memory contexts of this backend process.
>
> Any backend that participates in ProcSignal signaling must arrange to
> call this function if we see LogMemoryContextPending set. It is called
> from CHECK_FOR_INTERRUPTS(), which is enough because the target process
> for logging of memory contexts is a backend.
> -------------------------------
>
>
> + if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
> + HandleProcSignalLogMemoryContext();
> +
> if (CheckProcSignal(PROCSIG_BARRIER))
> HandleProcSignalBarrierInterrupt();
>
> The code for memory context logging interrupt came after barrier
> interrupt
> in other places, e.g., procsignal.h. Why is this order of code
> different?
Fixed.
> +/*
> + * pg_log_backend_memory_contexts
> + * Print memory context of the specified backend process.
>
> Isn't it better to move pg_log_backend_memory_contexts() to mcxtfuncs.c
> from mcxt.c because this is the SQL function memory contexts?
Agreed.
> IMO we should comment why we allow only superuser to call this
> function.
> What about the following?
Thanks!
Modified the patch according to the suggestions.
> -----------------
> Signal a backend process to log its memory contexts.
>
> Only superusers are allowed to signal to log the memory contexts
> because allowing any users to issue this request at an unbounded rate
> would cause lots of log messages and which can lead to denial of
> service.
> -----------------
>
> + PGPROC *proc = BackendPidGetProc(pid);
> +
> + /* Check whether the target process is PostgreSQL backend process. */
> + if (proc == NULL)
>
> What about adding more comments as follows?
>
> ---------------------------------
> + /*
> + * BackendPidGetProc returns NULL if the pid isn't valid; but by the
> time
> + * we reach kill(), a process for which we get a valid proc here
> might
> + * have terminated on its own. There's no way to acquire a lock on
> an
> + * arbitrary process to prevent that. But since this mechanism is
> usually
> + * used to debug a backend running and consuming lots of memory,
> + * that it might end on its own first and its memory contexts are not
> + * logged is not a problem.
> + */
> + if (proc == NULL)
> + {
> + /*
> + * This is just a warning so a loop-through-resultset will not abort
> + * if one backend logged its memory contexts during the run.
> + */
> + ereport(WARNING,
> ---------------------------------
>
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + errmsg("must be a superuser to log memory contexts")));
> + PG_RETURN_BOOL(false);
>
> This PG_RETURN_BOOL(false) is unnecessary because ereport() will emit
> an ERROR?
>
> +$node->psql('postgres', 'select
> pg_log_backend_memory_contexts(pg_backend_pid())');
> +
> +my $logfile = slurp_file($node->logfile);
> +like($logfile, qr/Grand total: \d+ bytes in \d+ blocks/,
> + 'found expected memory context print in the log file');
>
> Isn't there the case where the memory contexts have not been logged yet
> when slurp_file() is called? It's rare, though. If there is the risk
> for that,
> we should wait for the memory contexts to be actually logged?
>
> BTW, I agree that we should have the regression test that calls
> pg_log_backend_memory_contexts() and checks, e.g., that it doesn't
> cause
> segmentation fault. But I'm just wondering if it's a bit overkill to
> make perl
> scriptand start new node to test only this function...
OK, I removed the perl TAP test and added a regression test running
pg_log_backend_memory_contexts(pg_backend_pid()) and checking it returns
't'.
Regards.
Attachment | Content-Type | Size |
---|---|---|
v6-0001-add-memorycontext-elog-print.patch | text/x-diff | 26.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Arseny Sher | 2021-03-30 13:22:18 | Re: Flaky vacuum truncate test in reloptions.sql |
Previous Message | Julien Rouhaud | 2021-03-30 12:54:41 | Re: Issue with point_ops and NaN |