Re: Conflicting updates of command progress

From:	Sami Imseih <samimseih(at)gmail(dot)com>
To:	Antonin Houska <ah(at)cybertec(dot)at>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Conflicting updates of command progress
Date:	2025-04-23 20:20:27
Message-ID:	CAA5RZ0v5tYwsdSxQaFUh0xkHPnY62m55qZ=a7k6EewbP-zCMvA@mail.gmail.com
Lists:	pgsql-hackers

>> pgstat_progress_start_command should only be called once by the entry point
>> for the command. In theory, we could end up in a situation where start_command
>> is called multiple times during the same top-level command;

> Not only in theory - it actually happens when CLUSTER is rebuilding indexes.

In the case of CLUSTER, pgstat_progress_start_command is only called once,
but pgstat_progress_update_param is called in the context of both CLUSTER
and CREATE INDEX.
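
To make the CLUSTER case concrete, here is a toy model of a backend that
has a single progress slot. It is not PostgreSQL source: every toy_ name
and the TOY_SHARED_INDEX slot number are hypothetical, and the only idea
borrowed from the real code is that an update writes into a per-backend
parameter array by index, whichever command's code happens to be calling.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_PARAMS 20

typedef enum { TOY_CMD_INVALID, TOY_CMD_CLUSTER, TOY_CMD_CREATE_INDEX } ToyCommand;

/* One progress slot per backend (toy stand-in, not PgBackendStatus). */
typedef struct
{
    ToyCommand  command;
    int64_t     param[TOY_NUM_PARAMS];
} ToyProgress;

static ToyProgress toy_slot;

/* Called once by the entry point of the command; resets all parameters. */
static void
toy_start_command(ToyCommand cmd)
{
    toy_slot.command = cmd;
    memset(toy_slot.param, 0, sizeof(toy_slot.param));
}

/* Writes by index only; it has no idea which command the caller belongs to. */
static void
toy_update_param(int index, int64_t val)
{
    assert(index >= 0 && index < TOY_NUM_PARAMS);
    toy_slot.param[index] = val;
}

/* Hypothetical: a slot number that both code paths happen to use. */
#define TOY_SHARED_INDEX 1

/* Returns what a monitoring query would read from the shared slot. */
static int64_t
toy_cluster_with_index_rebuild(void)
{
    toy_start_command(TOY_CMD_CLUSTER);
    toy_update_param(TOY_SHARED_INDEX, 5);      /* reported by CLUSTER code */

    /* Index rebuild code runs next; it never calls toy_start_command. */
    toy_update_param(TOY_SHARED_INDEX, 42);     /* reported by index build code */

    return toy_slot.param[TOY_SHARED_INDEX];
}
```

In this model the slot still says the command is CLUSTER, but the value
in the shared index is the one written by the index build code.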

> That's a possible approach. However, if the index build takes long time in the
> CLUSTER case, the user will probably be interested in details about the index
> build.

I agree.

>> Is there a repro that you can share that shows the weird values? It sounds like
>> the repro is on top of [1]. Is that right?

>> You can reproduce the similar problem by creating a trigger function that
>> runs a progress-reporting command like COPY, and then COPY data into
>> a table that uses that trigger.

>> [2] https://commitfest.postgresql.org/patch/5282/

In this case, pgstat_progress_start_command is actually called
twice in the life of a single COPY command: the upper-level COPY
command calls pgstat_progress_start_command, and then the nested COPY
command calls pgstat_progress_start_command as well.
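
A rough way to see why the nested call is a problem is the toy model
below. It is a sketch, not PostgreSQL source: the toy_ functions, the
relation numbers 1001 and 2002, and the TOY_TUPLES_PROCESSED slot are
made up, and it simply assumes that starting a command resets the single
slot and that ending a command marks the slot as idle.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_PARAMS 20

typedef enum { TOY_CMD_INVALID, TOY_CMD_COPY } ToyCommand;

/* One progress slot per backend (toy stand-in). */
typedef struct
{
    ToyCommand  command;
    uint32_t    target;                 /* relation being processed */
    int64_t     param[TOY_NUM_PARAMS];
} ToyProgress;

static ToyProgress toy_slot;

static void
toy_start_command(ToyCommand cmd, uint32_t target)
{
    toy_slot.command = cmd;
    toy_slot.target = target;
    memset(toy_slot.param, 0, sizeof(toy_slot.param));
}

static void
toy_update_param(int index, int64_t val)
{
    assert(index >= 0 && index < TOY_NUM_PARAMS);
    toy_slot.param[index] = val;
}

static void
toy_end_command(void)
{
    toy_slot.command = TOY_CMD_INVALID;
    toy_slot.target = 0;
}

#define TOY_TUPLES_PROCESSED 0          /* hypothetical slot number */

/* Returns the command the slot reports while the outer COPY is still running. */
static ToyCommand
toy_nested_copy(void)
{
    toy_start_command(TOY_CMD_COPY, 1001);          /* upper-level COPY */
    toy_update_param(TOY_TUPLES_PROCESSED, 500);

    /* A trigger fires and runs a nested COPY on another table. */
    toy_start_command(TOY_CMD_COPY, 2002);          /* wipes the outer state */
    toy_update_param(TOY_TUPLES_PROCESSED, 3);
    toy_end_command();                              /* slot now looks idle */

    /* The upper-level COPY keeps going and keeps reporting. */
    toy_update_param(TOY_TUPLES_PROCESSED, 501);
    return toy_slot.command;
}
```

After the nested command ends, the slot in this model claims that no
command is running even though the upper-level COPY is still working,
and the outer target relation has been lost.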

> I think that can be implemented by moving the progress related fields from
> PgBackendStatus into a new structure and by teaching the backend to insert a
> new instance of that structure into a shared hash table (dshash.c)

I think it is a good idea in general to move the backend progress to
shared memory, with a new API that deals with the scenarios described
above:
1/ an (explicit) nested command was started by a top-level command, such
as the COPY case above.
2/ a top-level command triggered some other progress code implicitly, such as
CLUSTER triggering CREATE INDEX code.
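
One possible shape for such an API, purely as an illustration, is a
small stack of progress entries where updates always go to the innermost
entry. This is not the dshash-based design discussed above and not an
existing PostgreSQL interface; the toy_ names, TOY_MAX_DEPTH, and the
choice of a fixed-size array are all assumptions made for the sketch. It
also assumes the implicit case (2/) is handled by having the index build
code push its own entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_PARAMS 20
#define TOY_MAX_DEPTH  8                /* hypothetical nesting limit */

typedef enum { TOY_CMD_INVALID, TOY_CMD_CLUSTER, TOY_CMD_CREATE_INDEX } ToyCommand;

typedef struct
{
    ToyCommand  command;
    int64_t     param[TOY_NUM_PARAMS];
} ToyProgress;

static ToyProgress toy_stack[TOY_MAX_DEPTH];
static int         toy_depth = 0;

/* Start a (possibly nested) command: it gets its own zeroed entry. */
static void
toy_push_command(ToyCommand cmd)
{
    assert(toy_depth < TOY_MAX_DEPTH);
    toy_stack[toy_depth].command = cmd;
    memset(toy_stack[toy_depth].param, 0, sizeof(toy_stack[toy_depth].param));
    toy_depth++;
}

/* End the innermost command; the enclosing entry becomes current again. */
static void
toy_pop_command(void)
{
    assert(toy_depth > 0);
    toy_depth--;
}

/* Updates only ever touch the innermost entry. */
static void
toy_update_param(int index, int64_t val)
{
    assert(toy_depth > 0);
    assert(index >= 0 && index < TOY_NUM_PARAMS);
    toy_stack[toy_depth - 1].param[index] = val;
}

/* Read access for a monitoring view; level 0 is the top-level command. */
static const ToyProgress *
toy_entry(int level)
{
    assert(level >= 0 && level < toy_depth);
    return &toy_stack[level];
}
```

With this layout the top-level entry keeps its values while a nested
command reports into its own entry, so a view could show both levels
while an index build is running inside CLUSTER.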

I also like the shared memory approach because we would then no longer need
a message like the one introduced in f1889729dd3ab0 to support parallel index
vacuum progress (46ebdfe164c61).

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Re: Conflicting updates of command progress at 2025-04-23 10:26:12 from Antonin Houska

Responses

Re: Conflicting updates of command progress at 2025-04-24 09:33:45 from Antonin Houska
