From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, rahilasyed90(at)gmail(dot)com, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Masao Fujii <masao(dot)fujii(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, Vinayak Pokale <vinpokale(at)gmail(dot)com> |
Subject: | Re: [PROPOSAL] VACUUM Progress Checker. |
Date: | 2015-12-09 19:40:10 |
Message-ID: | CA+TgmoZ5q4N4T0c0_-XKTencEWOAbfdKtoPPT8NUjjjV5OHMFQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 7, 2015 at 2:41 AM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2015/12/03 19:05, Kyotaro HORIGUCHI wrote:
>> At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote
>>> By the way, there are some non-st_progress_* fields that I was talking
>>> about in my previous message. For example, st_activity_flag (which I have
>>> suggested to rename to st_command instead). It needs to be set once at the
>>> beginning of the command processing and need not be touched again. I think
>>> it may be a better idea to do the same for table name or OID. It also
>>> won't change over the duration of the command execution. So, we could set
>>> it once at the beginning where that becomes known. Currently in the patch,
>>> it's reported in st_progress_message[0] by every pgstat_report_progress()
>>> invocation. So, the table name will be strcpy()'d to shared memory for
>>> every scanned block that's reported.
>>
>> If we don't have dedicate reporting functions for each phase
>> (that is, static data phase and progress phase), the one function
>> pgstat_report_progress does that by having some instruction from
>> the caller and it would be a bitfield.
>>
>> Reading the flags for making decision whether every field to copy
>> or not and branching by that seems too much for int32 (or maybe
>> 64?) fields. Alhtough it would be in effective when we have
>> progress_message fields, I don't think it is a good idea without
>> having progress_message.
>>
>>> pgstat_report_progress(uint32 *param1, uint16 param1_bitmap,
>>> char param2[][..], uint16 param2_bitmap)
>>> {
>>> ...
>>> for(i = 0; i < 16 ; i++)
>>> {
>>> if (param1_bitmap & (1 << i))
>>> beentry->st_progress_param[i] = param1[i];
>>> if (param2_bitmap & (1 << i))
>>> strcpy(beentry->..., param2[i]);
>>> }
>>
>> Thoughts?
>
> Hm, I guess progress messages would not change that much over the course
> of a long-running command. How about we pass only the array(s) that we
> change and NULL for others:
>
> With the following prototype for pgstat_report_progress:
>
> void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
> bool *message_param[], int num_message_param);
>
> If we just wanted to change, say scanned_heap_pages, then we report that
> using:
>
> uint32_param[1] = scanned_heap_pages;
> pgstat_report_progress(uint32_param, 3, NULL, 0);
>
> Also, pgstat_report_progress() should check each of its parameters for
> NULL before iterating over to copy. So in most reports over the course of
> a command, the message_param would be NULL and hence not copied.
It's going to be *really* important that this facility provides a
lightweight way of updating progress, so I think this whole API is
badly designed. VACUUM, for example, is going to want a way to update
the individual counter for the number of pages it's scanned after
every page. It should not have to pass all of the other information
that is part of a complete report. It should just be able to say
pgstat_report_progress_update_counter(1, pages_scanned) or something
of this sort. Don't marshal all of the data and then make
pgstat_report_progress figure out what's changed. Provide a series of
narrow APIs where the caller tells you exactly what they want to
update, and you update only that. It's fine to have a
pgstat_report_progress() function to update everything at once, for
the use at the beginning of a command, but the incremental updates
within the command should do something lighter-weight.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2015-12-09 21:11:14 | Re: [DOCS] max_worker_processes on the standby |
Previous Message | Robert Haas | 2015-12-09 19:31:35 | Re: Re: Reusing abbreviated keys during second pass of ordered [set] aggregates |