From: | Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Subject: | Re: [HACKERS] CLUSTER command progress monitor |
Date: | 2019-03-06 06:38:54 |
Message-ID: | 03cc5c0e-243c-e4a0-c5cf-a1f8380ca530@lab.ntt.co.jp |
Lists: | pgsql-hackers |
On 2019/03/05 17:56, Tatsuro Yamada wrote:
> Hi Robert!
>
> On 2019/03/05 11:35, Robert Haas wrote:
>> On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
>> <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>> === Current design ===
>>>
>>> CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
>>> Depending on which one is chosen, the command will proceed in the
>>> following sequence of phases:
>>>
>>> * Scan method: Seq Scan
>>> 0. initializing (*2)
>>> 1. seq scanning heap (*1)
>>> 3. sorting tuples (*2)
>>> 4. writing new heap (*1)
>>> 5. swapping relation files (*2)
>>> 6. rebuilding index (*2)
>>> 7. performing final cleanup (*2)
>>>
>>> * Scan method: Index Scan
>>> 0. initializing (*2)
>>> 2. index scanning heap (*1)
>>> 5. swapping relation files (*2)
>>> 6. rebuilding index (*2)
>>> 7. performing final cleanup (*2)
>>>
>>> VACUUM FULL command will proceed in the following sequence of phases:
>>>
>>> 1. seq scanning heap (*1)
>>> 5. swapping relation files (*2)
>>> 6. rebuilding index (*2)
>>> 7. performing final cleanup (*2)
>>>
>>> (*1): increasing the value in heap_tuples_scanned column
>>> (*2): only shows the phase in the phase column
>>
>> All of that sounds good.
>>
>>> The view provides the information of CLUSTER command progress details as follows
>>> # \d pg_stat_progress_cluster
>>> View "pg_catalog.pg_stat_progress_cluster"
>>> Column | Type | Collation | Nullable | Default
>>> ---------------------------+---------+-----------+----------+---------
>>> pid | integer | | |
>>> datid | oid | | |
>>> datname | name | | |
>>> relid | oid | | |
>>> command | text | | |
>>> phase | text | | |
>>> cluster_index_relid | bigint | | |
>>> heap_tuples_scanned | bigint | | |
>>> heap_tuples_vacuumed | bigint | | |
>>
>> Still not sure if we need heap_tuples_vacuumed. We could try to
>> report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
>> we're using a Seq Scan.
>
> I have no strong opinion to add heap_tuples_vacuumed, so I'll remove that in
> next patch.
>
> Regarding heap_blks_scanned and heap_blks_total, I suppose that it is able to
> get those from initscan(). I'll investigate it more.
>
> cluster.c
> copy_heap_data()
> heap_beginscan()
> heap_beginscan_internal()
> initscan()
>
>
>
>>> === Discussion points ===
>>>
>>> - Progress counter for "3. sorting tuples" phase
>>> - Should we add pgstat_progress_update_param() in tuplesort.c like a
>>> "trace_sort"?
>>> Thanks to Peter Geoghegan for the useful advice!
>>
>> How would we avoid an abstraction violation?
>
> Hmm... What do you mean an abstraction violation?
> If it is difficult to solve, I'd not like to add the progress counter for the sorting tuples.
>
>
>>> - Progress counter for "6. rebuilding index" phase
>>> - Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
>>> If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
>>> However, I'm not sure whether it is okay or not.
>>
>> Doesn't seem unreasonable to me.
>
> I see, I'll add it later.

Attached is a revised WIP patch that includes the following changes:
- Remove heap_tuples_vacuumed
- Add heap_blks_scanned and heap_blks_total
- Add index_vacuum_count

While adding the heap_blks_scanned and heap_blks_total columns, I realized that
the heap_tuples_scanned column is suitable as a counter for both scan methods
(index scan and seq scan), because CLUSTER processes the heap on a tuple basis.
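
To illustrate why a tuple-based counter fits both scan methods, here is a
minimal standalone sketch. It is not code from the attached patch: the enum
values simply mirror the phase numbers listed in the current design, and
ProgressSlot, cluster_phase_sequence(), phase_updates_tuple_counter() and
scan_heap() are hypothetical names used only for this illustration. In the
backend the counter would be reported through pgstat_progress_update_param()
rather than written to a local struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical phase numbers mirroring the list in this thread;
 * they are NOT the constants defined by the patch. */
typedef enum {
    PHASE_INITIALIZING    = 0,
    PHASE_SEQ_SCAN_HEAP   = 1,
    PHASE_INDEX_SCAN_HEAP = 2,
    PHASE_SORT_TUPLES     = 3,
    PHASE_WRITE_NEW_HEAP  = 4,
    PHASE_SWAP_REL_FILES  = 5,
    PHASE_REBUILD_INDEX   = 6,
    PHASE_FINAL_CLEANUP   = 7
} ClusterPhase;

typedef enum { SCAN_SEQ, SCAN_INDEX } ScanMethod;

/* Stand-in for the per-backend progress slot (assumption). */
typedef struct {
    ClusterPhase phase;
    int64_t      heap_tuples_scanned;
} ProgressSlot;

/* Copy the phase sequence for the given scan method into out[]
 * (at most cap entries) and return the number of entries copied. */
size_t cluster_phase_sequence(ScanMethod method, ClusterPhase *out, size_t cap)
{
    static const ClusterPhase seq_phases[] = {
        PHASE_INITIALIZING, PHASE_SEQ_SCAN_HEAP, PHASE_SORT_TUPLES,
        PHASE_WRITE_NEW_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
        PHASE_FINAL_CLEANUP
    };
    static const ClusterPhase idx_phases[] = {
        PHASE_INITIALIZING, PHASE_INDEX_SCAN_HEAP, PHASE_SWAP_REL_FILES,
        PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
    };
    const ClusterPhase *src = (method == SCAN_SEQ) ? seq_phases : idx_phases;
    size_t n = (method == SCAN_SEQ)
        ? sizeof(seq_phases) / sizeof(seq_phases[0])
        : sizeof(idx_phases) / sizeof(idx_phases[0]);

    assert(out != NULL);
    if (n > cap)
        n = cap;
    for (size_t i = 0; i < n; i++)
        out[i] = src[i];
    return n;
}

/* True for the phases marked (*1) in the design, i.e. the phases
 * that advance heap_tuples_scanned. */
bool phase_updates_tuple_counter(ClusterPhase phase)
{
    return phase == PHASE_SEQ_SCAN_HEAP ||
           phase == PHASE_INDEX_SCAN_HEAP ||
           phase == PHASE_WRITE_NEW_HEAP;
}

/* Simulate the heap scan: whichever scan method is used, the work is
 * counted one tuple at a time, so the same counter serves both. */
void scan_heap(ScanMethod method, int64_t ntuples, ProgressSlot *slot)
{
    assert(slot != NULL && ntuples >= 0);
    slot->phase = (method == SCAN_SEQ) ? PHASE_SEQ_SCAN_HEAP
                                       : PHASE_INDEX_SCAN_HEAP;
    for (int64_t i = 0; i < ntuples; i++)
        slot->heap_tuples_scanned++;
}
```

In this sketch only the phase differs between the two scan methods, while
heap_tuples_scanned advances the same way in both. A block-based counter such
as heap_blks_scanned only maps cleanly onto the Seq Scan case, where the heap
is read in physical block order.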

Regards,
Tatsuro Yamada

Attachment | Content-Type | Size |
---|---|---|
progress_monitor_for_cluster_command_v8_code.patch | text/x-patch | 13.0 KB |