Re: monitoring CREATE INDEX [CONCURRENTLY]

From: Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: monitoring CREATE INDEX [CONCURRENTLY]
Date: 2019-04-01 20:43:53
Message-ID: CAOajBXRSqA4hDDJRov+gmYxJNocULphv-PO74JRwQEWGrYcjfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, 1 Apr 2019 at 21:40, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Hi Rahila, thanks for reviewing.
>
> On 2019-Mar-25, Rahila Syed wrote:
>
> > Please see few comments below:
> >
> > 1. Makecheck fails currently as view definition of expected rules.out
> does
> > not reflect latest changes in progress metrics numbering.
>
> Ouch ... fixed.
>
> > 2. + <entry>
> > I think there is a typo here 's/partitioned/partitioned table/'
>
> Ah, so there is. Fixed.
>
> > 3.
> > I think parallel scan equivalent bpscan->phs_nblocks along with
> > hscan->rs_nblocks should be used similar to startblock computation above.
>
> Hmm, yeah. I think the way things are setup currently it doesn't matter
> much, but it seems fragile to rely on that.
>
>
Thank you for incorporating the review comments.

> I also moved the reporting of total blocks to scan in the initial table
> scan so that it occurs wholly in heapam_index_build_range_scan; I had
> originally put that code in _bt_spools_heapscan, but that was a
> modularity violation I think. (It didn't make a practical difference,
> but it made no sense to have the two cases report the number in wildly
> different locations.) Also added a final nblocks metric update after
> the scan is done.
>
> I think this patch is done.
>

I tested the v8 patch by running plain CREATE INDEX, CIC, and for
partitioned tables
and the results are as expected. Please see few observations below.

1. FWIW, below results for CIC show that blocks_done does not become equal
to blocks_total at the end of the phase or it processes last 800 blocks so
fast that
the update is not visible in less than 1 secs interval.

*Mon Mar 25 11:06:31 IST 2019*
pid | datid | datname | relid | phase |
lockers_total | lockers_done | current_locker_pid | blocks_total |
blocks_done | tuples_total | tuples_done | partitions_total |
partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16384 | building index: table scan
| 0 | 0 | 0 | 1293366 |
1292535 | 0 | 0 | 0 | 0
(1 row)

*Mon Mar 25 11:06:31 IST 2019*
pid | datid | datname | relid | phase
| lockers_total | lockers_done | current_locker_pid | blocks_total |
blocks_done | tuples_total | tuples_done | partitions_total |
partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16384 | building index: sorting tuples, spool 1
| 0 | 0 | 0 | 0
| 0 | 200000000 | 0 | 0
| 0
(1 row)

2. However in case of partitioned tables, the following difference in
blocks_done versus blocks_total at the end of phase is notably high for the
first partition . Subsequent partitions show negligible difference.
Partition 1:
Mon Mar 25 14:27:57 IST 2019
pid | datid | datname | relid | phase |
lockers_total | lockers_done | current_locker_pid | blocks_total |
blocks_done | tuples_total | tuples_done | partitions_total |
partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16394 | building index: table scan
| 0 | 0 | 0 | 381342 |
221233 | 0 | 0 | 3 | 0
(1 row)

Mon Mar 25 14:27:57 IST 2019
pid | datid | datname | relid | phase
| lockers_total | lockers_done | current_locker_pid | blocks_total |
blocks_done | tuples_total | tuples_done | partitions_total |
partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16394 | building index: sorting tuples, spool 1
| 0 | 0 | 0 | 0
| 0 | 49999999 | 0 | 3
| 0

The partitions are equal in size and the other two partitions have
blocks_done and blocks_total to be approx. 221233. The blocks_total for
partition 1 is reported higher.

3. Sorry for nitpicking, I think following phase name can be made more
consistent with the others.
The non-am specific phase for scanning a table is named as scan heap while
am-specific one is named as table scan.
Can we use heap for am-specific one as well since heap is used elsewhere in
progress reporting too?

4. - scan = table_beginscan_parallel(btspool->heap,
ParallelTableScanFromBTShared(btshared));
+ scan = table_beginscan_parallel(btspool->heap,
+
ParallelTableScanFromBTShared(btshared));

Is this change required?

Besides the above comments ,the patch looks good to me.

Thank you,
--
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.com <http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message legrand legrand 2019-04-01 20:48:27 Re: DWIM mode for psql
Previous Message Thomas Munro 2019-04-01 20:39:09 Re: C_C_A animal on HEAD gets stuck in initdb