From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp |
Cc: | pokurev(at)pm(dot)nttdata(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org, bannos(at)nttdata(dot)co(dot)jp |
Subject: | Re: [PROPOSAL] VACUUM Progress Checker. |
Date: | 2016-02-16 09:25:27 |
Message-ID: | 20160216.182527.87385234.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
At Tue, 16 Feb 2016 10:39:27 +0900, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote in <56C27DCF(dot)7020705(at)lab(dot)ntt(dot)co(dot)jp>
>
> Hello,
>
> On 2016/02/15 20:21, Kyotaro HORIGUCHI wrote:
> > At Mon, 8 Feb 2016 11:37:17 +0900, Amit Langote wrote:
> >> On 2016/02/05 17:15, pokurev(at)pm(dot)nttdata(dot)co(dot)jp wrote:
> >>> Please find attached updated patch.
>
> [ ... ]
>
> >>
> >> Instead of passing the array of char *'s, why not just pass a single char
> >> *, because that's what it's doing - updating a single message. So,
> >> something like:
> >
> > As I might have written upthread, transferring the whole string
> > as a progress message is useless at least in this scenario. Since
> > they are a set of fixed messages, each of them can be represented
> > by an identifier, an integer number. I don't see a reason for
> > sending the whole of a string beyond a backend.
>
> This tends to make sense. Perhaps, they could be macros:
>
> #define VACUUM_PHASE_SCAN_HEAP 1
> #define VACUUM_PHASE_VACUUM_INDEX_HEAP 2
Exactly. Or an enum.
> > Next, the function pg_stat_get_command_progress() has a somewhat
> > generic name, but it seems to reuturn the data only for the
> > backends with beentry->st_command = COMMAND_LAZY_VACUUM and has
> > the column names specific for vucuum like process. If the
> > function is intended to be generic, it might be better to return
> > a set of integer[] for given type. Otherwise it should have a
> > name represents its objective.
>
> Agreed.
>
> >
> > CREATE FUNCTION
> > pg_stat_get_command_progress(IN cmdtype integer)
> > RETURNS SETOF integer[] as $$....
> >
> > SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x
> > x
> > ---------------------_
> > {1233, 16233, 1, ....}
> > {3244, 16236, 2, ....}
> > ....
>
> I am not sure what we would pass as argument to the (SQL) function
> pg_stat_get_command_progress() in the system view definition for
> individual commands - what is PROGRESS_COMMAND_VACUUM exactly? Would
> string literals like "vacuum", "cluster", etc. to represent command names
> work?
Sorry, it is a symbol to tell pg_stat_get_command_progress() to
return stats numbers of backends running VACUUM. It should have
been COMMAND_LAZY_VACUUM for this patch. If we want progress of
CREATE INDEX, it would be COMMAND_CREATE_INDEX.
> >
> > CREATE VIEW pg_stat_vacuum_progress AS
> > SELECT S.s[1] as pid,
> > S.s[2] as relid,
> > CASE S.s[3]
> > WHEN 1 THEN 'Scanning Heap'
> > WHEN 2 THEN 'Vacuuming Index and Heap'
> > ELSE 'Unknown phase'
> > END,
> > ....
> > FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
> >
> > # The name of the function could be other than *_command_progress.
> >
> > Any thoughts or opinions?
>
> How about pg_stat_get_progress_info()?
I think it's good.
> >> + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> >> + pgstat_report_progress_update_message(0, progress_message);
> >> /* Remove index entries */
> >> for (i = 0; i < nindexes; i++)
> >> + {
> >> lazy_vacuum_index(Irel[i],
> >> &indstats[i],
> >> vacrelstats);
> >> + scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> >> + /* Update the scanned index pages and number of index scan */
> >> + pgstat_report_progress_update_counter(3, scanned_index_pages);
> >> + pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans
> >> + 1);
> >> + }
> >> /* Remove tuples from heap */
> >> lazy_vacuum_heap(onerel, vacrelstats);
> >> vacrelstats->num_index_scans++;
> >> + scanned_index_pages = 0;
> >>
> >> I guess num_index_scans could better be reported after all the indexes are
> >> done, that is, after the for loop ends.
> >
> > Precise reporting would be valuable if vacuuming indexes takes a
> > long time. It seems to me to be fine as it is since updating of
> > stat counters wouldn't add any significant overhead.
>
> Sorry, my comment may be a bit unclear. vacrelstats->num_index_scans
> doesn't count individual indexes vacuumed but rather the number of times
> "all" the indexes of a table are vacuumed, IOW, the number of times the
> vacuum phase runs. Purpose of counter #4 there seems to be to report the
> latter. OTOH, reporting scanned_index_pages per index as is done in the
> patch is alright.
I got it. Sorry for my misreading. Yes, you're
right. index_scan_count can take atmost 1 by the code. That's
odd.
> That said, there is discussion upthread about more precise reporting on
> index vacuuming by utilizing the lazy_tid_reaped() (the index bulk delete
> callback) as a place where we can report what index block number we are
> at. I think that would mean the current IndexBulkDeleteCallback signature
> is insufficient, which is the following:
>
> /* Typedef for callback function to determine if a tuple is bulk-deletable */
> typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
>
> One more parameter would be necessary:
>
> typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, BlockNumber
> current_index_blkno, void *state);
It could work for btree but doesn't for, for example,
gin. ginbulkdelete finds the next page in the following way.
> blkno = GinPageGetOpaque(page)->rightlink;
We should use another value to fagure the progress. If the
callback is called centainly the same or around the same number
of times with the total page numbers, the callback should just
increment a static counter for processed pages.
> That would also require changing all the am specific vacuumpage routines
> (like btvacuumpage) to also pass the new argument. Needless to say, some
> bookkeeping information would also need to be kept in LVRelStats (the
> "state" in above signature).
>
> Am I missing something?
So, maybe missing the case of other than btree..
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro HORIGUCHI | 2016-02-16 09:47:55 | Identifying a message in emit_log_hook. |
Previous Message | Fabien COELHO | 2016-02-16 08:53:59 | Re: extend pgbench expressions with functions |