Re: Parallel heap vacuum

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel heap vacuum
Date: 2025-03-10 07:46:13
Message-ID: CAHut+Pu5a-YoXxOJW+mixkRXiOuGfH=Y25TwEoH5YjUJ=YdLnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Sawada-San.

Some review comments for patch v11-0003.

======
src/backend/access/heap/vacuumlazy.c

1.
+typedef struct LVScanData
+{
+ BlockNumber rel_pages; /* total number of pages */
+
+ BlockNumber scanned_pages; /* # pages examined (not skipped via VM) */
+
+ /*
+ * Count of all-visible blocks eagerly scanned (for logging only). This
+ * does not include skippable blocks scanned due to SKIP_PAGES_THRESHOLD.
+ */
+ BlockNumber eager_scanned_pages;
+
+ BlockNumber removed_pages; /* # pages removed by relation truncation */
+ BlockNumber new_frozen_tuple_pages; /* # pages with newly frozen tuples */
+
+
+ /* # pages newly set all-visible in the VM */
+ BlockNumber vm_new_visible_pages;
+
+ /*
+ * # pages newly set all-visible and all-frozen in the VM. This is a
+ * subset of vm_new_visible_pages. That is, vm_new_visible_pages includes
+ * all pages set all-visible, but vm_new_visible_frozen_pages includes
+ * only those which were also set all-frozen.
+ */
+ BlockNumber vm_new_visible_frozen_pages;
+
+ /* # all-visible pages newly set all-frozen in the VM */
+ BlockNumber vm_new_frozen_pages;
+
+ BlockNumber lpdead_item_pages; /* # pages with LP_DEAD items */
+ BlockNumber missed_dead_pages; /* # pages with missed dead tuples */
+ BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
+
+ /* Counters that follow are only for scanned_pages */
+ int64 tuples_deleted; /* # deleted from table */
+ int64 tuples_frozen; /* # newly frozen */
+ int64 lpdead_items; /* # deleted from indexes */
+ int64 live_tuples; /* # live tuples remaining */
+ int64 recently_dead_tuples; /* # dead, but not yet removable */
+ int64 missed_dead_tuples; /* # removable, but not removed */
+
+ /* Tracks oldest extant XID/MXID for setting relfrozenxid/relminmxid. */
+ TransactionId NewRelfrozenXid;
+ MultiXactId NewRelminMxid;
+ bool skippedallvis;
+} LVScanData;
+

1a.
Double blank line after 'new_frozen_tuple_pages' field.

~

1b.
I know you have only refactored existing code but IMO, the use of "#"
meaning "Number of" or "Count of" doesn't seem nice. It *may* be
justifiable to prevent wrapping of the short comments on the same line
as the fields, but for all the other larger comments it looks strange
not to be spelled out properly as words.

~~~

heap_vacuum_rel:

2.
/* Initialize page counters explicitly (be tidy) */
- vacrel->scanned_pages = 0;
- vacrel->eager_scanned_pages = 0;
- vacrel->removed_pages = 0;
- vacrel->new_frozen_tuple_pages = 0;
- vacrel->lpdead_item_pages = 0;
- vacrel->missed_dead_pages = 0;
- vacrel->nonempty_pages = 0;
+ scan_data = palloc(sizeof(LVScanData));
+ scan_data->scanned_pages = 0;
+ scan_data->eager_scanned_pages = 0;
+ scan_data->removed_pages = 0;
+ scan_data->new_frozen_tuple_pages = 0;
+ scan_data->lpdead_item_pages = 0;
+ scan_data->missed_dead_pages = 0;
+ scan_data->nonempty_pages = 0;
+ scan_data->tuples_deleted = 0;
+ scan_data->tuples_frozen = 0;
+ scan_data->lpdead_items = 0;
+ scan_data->live_tuples = 0;
+ scan_data->recently_dead_tuples = 0;
+ scan_data->missed_dead_tuples = 0;
+ scan_data->vm_new_visible_pages = 0;
+ scan_data->vm_new_visible_frozen_pages = 0;
+ scan_data->vm_new_frozen_pages = 0;
+ scan_data->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel);
+ vacrel->scan_data = scan_data;

Or, you could replace most of these assignments by using palloc0, and
write a comment to say all the counters were zapped to 0.

~~~

3.
+ scan_data->vm_new_frozen_pages = 0;
+ scan_data->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel);
+ vacrel->scan_data = scan_data;
/* dead_items_alloc allocates vacrel->dead_items later on */

That comment about dead_items_alloc seems misplaced now because it is
grouped with all the scan_data stuff.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-03-10 07:46:53 Re: per backend WAL statistics
Previous Message Aleksander Alekseev 2025-03-10 07:40:01 [PATCH] Add reverse(bytea)