From: | "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add index scan progress to pg_stat_progress_vacuum |
Date: | 2022-12-14 05:09:46 |
Message-ID: | 20F7ED07-4D32-4231-A81C-432EB111F0B2@amazon.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> First of all, I don't think we need to declare ParallelVacuumProgress
> in vacuum.c since it's set and used only in vacuumparallel.c. But I
> don't even think it's a good idea to declare it in vacuumparallel.c as
> a static variable. The primary reason is that it adds things we need
> to care about. For example, what if we raise an error during index
> vacuum? The transaction aborts but ParallelVacuumProgress still refers
> to something old. Suppose further that the next parallel vacuum
> doesn't launch any workers, the leader process would still end up
> accessing the old value pointed by ParallelVacuumProgress, which
> causes a SEGV. So we need to reset it anyway at the beginning of the
> parallel vacuum. It's easy to fix at this time but once the parallel
> vacuum code gets more complex, it could forget to care about it.
> IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different
> story. They are set in vacuumparallel.c and are used in vacuum.c for
> vacuum delay. If they weren't global variables, we would need to pass
> them to every function that could eventually call the vacuum delay
> function. So it makes sense to me to have them as global variables.On
> the other hand, for ParallelVacuumProgress, it's a common pattern that
> ambulkdelete(), amvacuumcleanup() or a common index scan routine like
> btvacuumscan() checks the progress. I don't think index AM needs to
> pass the value down to many of its functions. So it makes sense to me
> to pass it via IndexVacuumInfo.
Thanks for the detailed explanation and especially clearing up
my understanding of VacuumSharedCostBalance and VacuumActiveNWorker.
I do now think that passing ParallelVacuumState in IndexVacuumInfo is
a more optimal choice.
Attached version addresses the above and the previous comments.
Thanks
Sami Imseih
Amazon Web Services (AWS)
Attachment | Content-Type | Size |
---|---|---|
v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch | application/octet-stream | 22.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Erik Rijkers | 2022-12-14 05:20:22 | Re: Schema variables - new implementation for Postgres 15 (typo) |
Previous Message | Pavel Stehule | 2022-12-14 04:54:48 | Re: Schema variables - new implementation for Postgres 15 |