From: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, alvherre(at)2ndquadrant(dot)com, mailings(at)oopsware(dot)de, thomas(dot)munro(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Progress reporting for pg_verify_checksums |
Date: | 2019-03-30 20:07:41 |
Message-ID: | 1553976461.4884.66.camel@credativ.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
so we are basically back at the original patch as written by Bernd :)
> +/*
> + * Report current progress status. Parts borrowed from
> + * PostgreSQLs' src/bin/pg_basebackup.c
> + */
> +static void
> +progress_report(bool force)
> +{
> + int percent;
> + char total_size_str[32];
> + char current_size_str[32];
> + pg_time_t now;
> +
> + if (!showprogress)
> + return;
The way you've now changed this is that there's a function call into
progress_report() for every block that's being read, even if there is no
progress reporting requested. That looks like a pretty severe
performance problem so I suggest to at least stick with checking
showprogress before calling progress_report() and not the other way
round.
I guess there is still some noticeable performance penalty to pay when
reporting progress, even more so now that you reverted to only doing it
once per second - if we report progress, we presumably call
report_progress() thousands of times and return early 99,9% of the time.
So my vote is in favour of only calling progress_report() every once in
a while - I saw quite a speedup (or removal of slowdown) due to this in
my tests, this was not just some unwarranted microoptimization.
> + fprintf(stderr, "\r");
I think the isatty() check that was in our versions of the patch is
useful here, did you check how this looks when piping the output to a
file?
> @@ -195,12 +254,23 @@ scan_file(const char *fn, BlockNumber segmentno)
> _("%s: checksums enabled in file
\"%s\"\n"), progname, fn);
> }
>
> + /* Make sure progress is reported at least once per file */
> + progress_report(false);
> +
> close(f);
> }
This hunk is from the performance optimization of calling
progress_report only every 1024 blocks and could be removed if we stick
with calling it for every block anyway (but see above).
> -static void
> -scan_directory(const char *basedir, const char *subdir)
> +/*
> + * Scan the given directory for items which can be checksummed and
> + * operate on each one of them. If "sizeonly" is true, the size of
> + * all the items which have checksums is computed and returned back
"computed" is maybe a bit strong a word here, how about "added up"?
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2019-03-30 20:13:01 | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Previous Message | Tomas Vondra | 2019-03-30 19:25:43 | Re: Enable data checksums by default |