From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | "Syed, Rahila" <Rahila(dot)Syed(at)nttdata(dot)com> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, "PostgreSQL-development (pgsql-hackers(at)postgresql(dot)org)" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PROPOSAL] VACUUM Progress Checker. |
Date: | 2015-10-22 15:10:21 |
Message-ID: | 20151022151021.GI3391@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Syed, Rahila wrote:
> @@ -355,6 +356,7 @@ vacuum(int options, RangeVar *relation, Oid relid, VacuumParams *params,
> vac_update_datfrozenxid();
> }
>
> + pgstat_reset_activityflag;
> /*
> * Clean up working storage --- note we must do this after
> * StartTransactionCommand, else we might be trying to delete the active
Does this actually compile?
> @@ -596,11 +630,42 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
> /* Log cleanup info before we touch indexes */
> vacuum_log_cleanup_info(onerel, vacrelstats);
>
> + /*
> + * If passes through indexes exceed 1 add
> + * pages equal to rel_index_pages to the count of
> + * total pages to be scanned.
> + */
> + if (vacrelstats->num_index_scans >= 1)
> + {
> + total_index_pages = total_index_pages + rel_index_pages;
> + total_pages = total_heap_pages + total_index_pages;
> + }
Having the keep total_pages updated each time you change one of the
summands seems tedious and error-prone. Why can't it be computed
whenever it is going to be used instead?
> + memcpy((char *) progress_message[0], schemaname, schemaname_len);
> + progress_message[0][schemaname_len] = '\0';
> + strcat(progress_message[0],".");
> + strcat(progress_message[0],relname);
snprintf()? I don't think you need to keep track of schemaname_len at
all.
> + scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> + scanned_total_pages = scanned_total_pages + RelationGetNumberOfBlocks(Irel[i]);
> + /* Report progress to the statistics collector */
> + progress_param[0] = total_pages;
> + progress_param[1] = scanned_total_pages;
> + progress_param[2] = total_heap_pages;
> + progress_param[3] = vacrelstats->scanned_pages;
> + progress_param[4] = total_index_pages;
> + progress_param[5] = scanned_index_pages;
In fact, I wonder if you need to send total_pages at all -- surely the
client can add both total_heap_pages and total_index_pages by itself ...
> + memcpy((char *) progress_message[0], schemaname, schemaname_len);
> + progress_message[0][schemaname_len] = '\0';
> + strcat(progress_message[0],".");
> + strcat(progress_message[0],relname);
snprintf().
> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> index ab018c4..f97759e 100644
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -2851,6 +2851,55 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
> pgstat_increment_changecount_after(beentry);
> }
>
> +/* ---------------------------------------------
> + * Called from VACUUM after every heap page scan or index scan
> + * to report progress
> + * ---------------------------------------------
> + */
> +
> +void
> +pgstat_report_progress(uint *param1, int num_of_int, double *param2, int num_of_float,
> + char param3[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM],
> + int num_of_string)
> +{
> + volatile PgBackendStatus *beentry = MyBEEntry;
> + int i;
> +
> + if (!beentry)
> + return;
> +
> + if (!pgstat_track_activities)
> + return;
> +
> + pgstat_increment_changecount_before(beentry);
> +
> + for(i = 0; i < num_of_int; i++)
> + {
> + beentry->progress_param[i] = param1[i];
> + }
> + for (i = 0; i < num_of_float; i++)
> + {
> + beentry->progress_param_float[i] = param2[i];
> + }
> + for (i = 0; i < num_of_string; i++)
> + {
> + strcpy((char *)beentry->progress_message[i], param3[i]);
> + }
> + pgstat_increment_changecount_after(beentry);
> +}
It seems a bit strange that the remaining progress_param entries are not
initialized to anything. Also, why aren't the number of params of each
type saved too? In the receiving code you check whether each value
equals 0, and if it does then report NULL, but imagine vacuuming a table
with no indexes where the number of index pages is going to be zero.
Shouldn't we display zero there rather than null? Maybe I'm missing
something and that does work fine.
This patch lacks a comment somewhere explaining how this whole thing
works.
> diff --git a/src/include/pgstat.h b/src/include/pgstat.h
> index 9ecc163..4214b3d 100644
> --- a/src/include/pgstat.h
> +++ b/src/include/pgstat.h
> @@ -20,6 +20,7 @@
> #include "utils/hsearch.h"
> #include "utils/relcache.h"
>
> +#include "storage/block.h"
I believe you don't need this include.
> @@ -776,6 +779,12 @@ typedef struct PgBackendStatus
>
> /* current command string; MUST be null-terminated */
> char *st_activity;
> +
> + uint32 flag_activity;
> + uint32 progress_param[N_PROGRESS_PARAM];
> + double progress_param_float[N_PROGRESS_PARAM];
> + char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];
> +
> } PgBackendStatus;
This not only adds an unnecessary empty line at the end of the struct
declaration, but also fails to preserve the "st_" prefix used in all the
other fields.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Fabien COELHO | 2015-10-22 15:16:13 | Re: pgbench throttling latency limit |
Previous Message | Robert Haas | 2015-10-22 14:46:00 | Re: a raft of parallelism-related bug fixes |