From: | Rahila Syed <rahilasyed90(at)gmail(dot)com> |
---|---|
To: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Enhancing Memory Context Statistics Reporting |
Date: | 2025-03-13 13:56:51 |
Message-ID: | CAH2L28vULzqoit+YCKR5UhdT+AR+b1Qcs6Hgpz6nQz6NBT2jug@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Please find attached updated and rebased patches. It has the following
changes
1. To prevent memory leaks, ensure that the latest statistics published by
a process
are freed before it exits. This can be achieved by calling dsa_free in the
before_shmem_exit callback.
2. Add a level column to maintain consistency with the output of
pg_backend_memory_contexts.
Thank you,
Rahila Syed
On Tue, Mar 4, 2025 at 12:30 PM Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
> Hi Daniel,
>
> Thanks for the rebase, a few mostly superficial comments from a first
>> read-through.
>>
> Thank you for your comments.
>
>
>> The documentation refers to attributes in the return row but the format
>> of that
>> row isn't displayed which makes following along hard. I think we should
>> include a table or a programlisting showing the return data before this
>> paragraph.
>>
>> I included the sql function call and its output in programlisting format
> after the
> function description.
> Since the description was part of a table, I added this additional
> information at the
> end of that table.
>
>
>> +const char *
>> +AssignContextType(NodeTag type)
>> This function doesn't actually assign anything so the name is a bit
>> misleading,
>> it would be better with ContextTypeToString or something similar.
>>
>> Done.
>
>
>>
>> + * By default, only superusers or users with PG_READ_ALL_STATS are
>> allowed to
>> This sentence is really long and should probably be broken up.
>>
>> Fixed.
>
>>
>> + * The shared memory buffer has a limited size - it the process has too
>> many
>> s/it/if/
>>
>> Fixed.
>
>
>> + * If the publishing backend does not respond before the condition
>> variable
>> + * times out, which is set to MEMSTATS_WAIT_TIMEOUT, retry for max_tries
>> + * number of times, which is defined by user, before giving up and
>> + * returning previously published statistics, if any.
>> This comment should mention what happens if the process gives up and
>> there is
>> no previously published stats.
>>
>> Done.
>
>
>>
>> + int i;
>> ...
>> + for (i = 0; i < memCtxState[procNumber].total_stats; i++)
>> This can be rewritten as "for (int i = 0; .." since we allow C99.
>>
>> Done.
>
>
>>
>> + * process running and consuming lots of memory, that it might end on
>> its
>> + * own first and its memory contexts are not logged is not a problem.
>> This comment is copy/pasted from pg_log_backend_memory_contexts and while
>> it
>> mostly still apply it should at least be rewritten to not refer to
>> logging as
>> this function doesn't do that.
>>
>> Fixed.
>
>
>>
>> + ereport(WARNING,
>> + (errmsg("PID %d is not a PostgreSQL server process",
>> No need to add the extra parenthesis around errmsg anymore, so I think
>> new code
>> should omit those.
>>
>> Done.
>
>
>>
>> + errhint("Use pg_backend_memory_contexts view instead")));
>> Super nitpick, but errhints should be complete sentences ending with a
>> period.
>>
>> Done.
>
>
>>
>> + * statitics have previously been published by the backend. In which
>> case,
>> s/statitics/statistics/
>>
>> Fixed.
>
>
>>
>> + * statitics have previously been published by the backend. In which
>> case,
>> + * check if statistics are not older than curr_timestamp, if they are
>> wait
>> I think the sentence around the time check is needlessly confusing, could
>> it be
>> rewritten into something like:
>> "A valid DSA pointer isn't proof that statistics are available, it
>> can be
>> valid due to previously published stats. Check if the stats are
>> updated by
>> comparing the timestamp, if the stats are newer than our previously
>> recorded timestamp from before sending the procsignal they must by
>> definition be updated."
>>
>> Replaced accordingly.
>
>
>>
>> + /* Assert for dsa_handle to be valid */
>> Was this intended to be turned into an Assert call? Else it seems better
>> to remove.
>>
>
> Added an assert and removed the comment.
>
>
>> + if (print_location != PRINT_STATS_NONE)
>> This warrants a comment stating why it makes sense.
>>
>
>> + * Do not print the statistics if print_to_stderr is PRINT_STATS_NONE,
>> s/print_to_stderr/print_location/. Also, do we really need
>> print_to_stderr in
>> this function at all? It seems a bit awkward to combine a boolean and a
>> paramter for a tri-state value when the parameter holds the tri_state
>> already.
>> For readability I think just checking print_location will be better since
>> the
>> value will be clear, where print_to_stderr=false is less clear in a
>> tri-state
>> scenario.
>>
>> I removed the boolean print_to_stderr, the checks are now using
> the tri-state enum-print_location.
>
>
>> + * its ancestors to a list, inorder to compute a path.
>> s/inorder/in order/
>>
>> Fixed.
>
>
>>
>> + elog(LOG, "hash table corrupted, can't construct path value");
>> + break;
>> This will return either a NIL list or a partial path, but
>> PublishMemoryContext
>> doesn't really take into consideration that it might be so. Is this
>> really
>> benign to the point that we can blindly go on? Also, elog(LOG..) is
>> mostly for
>> tracing or debugging as elog() isn't intended for user facing errors.
>>
>> I agree that this should be addressed. I added a check for path value
> before
> storing it in shared memory. If the path is NIL, the path pointer in DSA
> will point
> to InvalidDsaPointer.
> When a client encounters an InvalidDsaPointer it will print NULL in the
> path column.
> Partial path scenario is unlikely IMO, and I am not sure if it warrants
> additional
> checks.
>
>
>> +static void
>> +compute_num_of_contexts(List *contexts, HTAB *context_id_lookup,
>> + int *stats_count, bool get_summary)
>> This function does a lot than compute the number of contexts so the name
>> seems
>> a bit misleading. Perhaps a rename to compute_contexts() or something
>> similar?
>>
>> Renamed to compute_contexts_count_and_ids.
>
>
>>
>> + memctx_info[curr_id].name = dsa_allocate0(area,
>> + strlen(clipped_ident) + 1);
>> These calls can use idlen instead of more strlen() calls no? While there
>> is no
>> performance benefit, it would increase readability IMHO if the code first
>> calculates a value, and then use it.
>>
>> Done.
>
>
>>
>> + strncpy(name,
>> + clipped_ident, strlen(clipped_ident));
>> Since clipped_ident has been nul terminated earlier there is no need to
>> use
>> strncpy, we can instead use strlcpy and take the target buffer size into
>> consideration rather than the input string length.
>>
>> Replaced with the strlcpy calls.
>
>
>>
>> PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory contexts
>> */
>> + PROCSIG_GET_MEMORY_CONTEXT, /* ask backend to log the memory contexts
>> */
>> This comment should be different from the LOG_MEMORY_xx one.
>>
>> Fixed.
>
> +#define MEM_CONTEXT_SHMEM_STATS_SIZE 30
>> +#define MAX_TYPE_STRING_LENGTH 64
>> These are unused, from an earlier version of the patch perhaps?
>>
>> Removed
>
> + * Singe DSA area is created and used by all the processes,
>> s/Singe/Since/
>>
>
> Fixed.
>
> +typedef struct MemoryContextBackendState
>> This is only used in mcxtfuncs.c and can be moved there rather than being
>> exported in the header.
>>
>
> This is being used in mcxt.c too in the form of the variable memCtxState.
>
>
>>
>
> +} MemoryContextId;
>> This lacks an entry in the typedefs.list file.
>>
>> Added.
>
> Please find attached the updated patches with the above fixes.
>
> Thank you,
> Rahila Syed
>
Attachment | Content-Type | Size |
---|---|---|
v17-0001-Preparatory-changes-for-reporting-memory-context-sta.patch | application/octet-stream | 4.3 KB |
v17-0002-Function-to-report-memory-context-statistics.patch | application/octet-stream | 54.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Matheus Alcantara | 2025-03-13 13:59:17 | Re: dblink: Add SCRAM pass-through authentication |
Previous Message | Dilip Kumar | 2025-03-13 13:56:28 | Re: Add an option to skip loading missing publication to avoid logical replication failure |