Re: log_autovacuum in Postgres 14 -- ordering issue

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: log_autovacuum in Postgres 14 -- ordering issue
Date: 2021-08-26 00:02:21
Message-ID: CAH2-WzkkGT2Gt4XauS5eQOQi4mVvL5X49hBTtWccC8DEqeNfKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 25, 2021 at 2:06 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> You mean:
>
> LOG: automatic vacuum of table "regression.public.bmsql_order_line": index scans: 1
> pages: 0 removed, 8810377 remain, 0 skipped due to pins, 3044924 frozen
> tuples: 16819838 removed, 576364686 remain, 2207444 are dead but not yet removable, oldest xmin: 88197949
> index scan needed: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed
> index "bmsql_order_line_pkey": pages: 2380261 in total, 0 newly deleted, 0 currently deleted, 0 reusable
> I/O timings: read: 65813.666 ms, write: 11310.689 ms
> avg read rate: 65.621 MB/s, avg write rate: 48.403 MB/s
> buffer usage: 174505 hits, 7630386 misses, 5628271 dirtied
> WAL usage: 7387358 records, 4051205 full page images, 28472185998 bytes
> system usage: CPU: user: 72.55 s, system: 52.07 s, elapsed: 908.42 s

Yes, exactly.

> I like it better than the current layout, so +1.

This seems like a release housekeeping task to me. I'll come up with
a patch targeting 14 and master in a few days.

> I think the "index scan needed" line (introduced very late in the 14
> cycle, commit 5100010ee4d5 dated April 7 2021) is a bit odd.

But that's largely a reflection of what's going on here.

> It is
> telling us stuff about the table -- how many pages had TIDs removed, am
> I reading that right? -- and it is also telling us whether indexes were
> scanned. But the fact that it starts with "index scan needed" suggests
> that it's talking about indexes.

The question of whether or not we do an index scan (i.e. index
vacuuming) depends entirely on the number of LP_DEAD items that heap
pruning left behind in the table structure. Actually, sometimes it's
~100% opportunistic pruning that happens to run outside of VACUUM (in
which case VACUUM merely notices and collects TIDs to delete from
indexes) -- it depends entirely on the workload. This isn't a new
thing added in commit 5100010ee4d5, really. That commit merely made
the index-bypass behavior not only occur when we had precisely 0 items
to delete from indexes -- now it can be skipped when the percentage of
heap pages with one or more LP_DEAD items is < 2%. So yes: this "pages
from table" output *is* primarily concerned with what happened with
indexes, even though the main piece of information says something
about the heap/table structure.

Note that in general a table could easily have many many more "tuples:
N removed" than "N dead item identifiers removed" in its
log_autovacuum output -- this is very common (any table that mostly or
only gets HOT updates and no deletes will look like that). The
opposite situation is also possible, and almost as common with tables
that only get non-HOT updates. The BenchmarkSQL TPC-C implementation
has tables in both categories -- it does tend to be a stable thing for
a table, in general.

Here is the second largest BenchmarkSQL table (this is just a random
VACUUM operation from logs used by a recent benchmark of mine):

automatic aggressive vacuum of table "regression.public.bmsql_oorder":
index scans: 1
pages: 0 removed, 943785 remain, 6 skipped due to pins, 205851 skipped frozen
tuples: 63649 removed, 105630136 remain, 2785 are dead but not yet
removable, oldest xmin: 186094041
buffer usage: 2660543 hits, 1766591 misses, 1375104 dirtied
index scan needed: 219092 pages from table (23.21% of total) had
14946563 dead item identifiers removed
index "bmsql_oorder_pkey": pages: 615866 in total, 0 newly deleted, 0
currently deleted, 0 reusable
index "bmsql_oorder_idx1": pages: 797957 in total, 131608 newly
deleted, 131608 currently deleted, 131608 reusable
avg read rate: 33.933 MB/s, avg write rate: 26.413 MB/s
I/O timings: read: 105551.978 ms, write: 16538.690 ms
system usage: CPU: user: 79.71 s, system: 49.74 s, elapsed: 406.73 s
WAL usage: 1934993 records, 1044051 full page images, 7076820876 bytes

On Postgres 13 you'd only see "tuples: 63649 removed" here. You'd
never see anything like "14946563 dead item identifiers removed", even
though that's obviously hugely important (more important than "tuples
removed", even). A user could be forgiven for thinking that HOT must
hurt performance! So yes, I agree. This *is* a bit odd.

(Another problem here is that "205851 skipped frozen" only counts
those heap pages that were specifically skipped frozen, even for a
non-aggressive VACUUM.)

> I think we should reword this line. I
> don't have any great ideas; what do you think of this?
>
> dead items: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed; index scan {needed, not needed, bypassed, bypassed by failsafe}
>
> I have to say that I am a bit bothered about the coding pattern used to
> build this sentence from two parts. I'm not sure it'll work okay in
> languages that build sentences in different ways. Maybe we should split
> this in two lines, one to give the numbers and the other to talk about
> the decision taken about indexes.

I'm happy to work with you to make the message more translatable. But
it's not easy. I personally believe that this kind of information is
generally only valuable in some specific context. Usually the rate of
change over time is a big part of what is truly interesting.
Recognizable correlations with good or bad performance (perhaps
determined at some much higher level of the user's stack) are also
important.

For example, here is what BenchmarkSQL shows for the first few VACUUMs
for its new order table, which is supposed to have a more or less
fixed size (but actually doesn't right now):

index scan needed: 7810 pages from table (15.28% of total) had 452785
dead item identifiers removed
...
index scan needed: 8482 pages from table (16.60% of total) had 456030
dead item identifiers removed
...
index scan needed: 8811 pages from table (17.24% of total) had 454976
dead item identifiers removed

These 3 VACUUMs are all within an hour of each other -- the percentage
here slowly climbs over many hours. Because of heap fragmentation,
this percentage never stops growing -- though it will take maybe 12+
hours for it to saturate at ~99.5%. Obviously it's hard to explain
this stuff in a way that will clearly generalize to many different
situations. At the same time I believe that many DBAs will find these
details very useful. Even when they have a flawed understanding of
what each item truly means. They're mostly looking at patterns,
trends.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message alvherre@alvh.no-ip.org 2021-08-26 00:20:04 Re: prevent immature WAL streaming
Previous Message alvherre@alvh.no-ip.org 2021-08-25 23:29:54 Re: prevent immature WAL streaming