Re: Eager page freeze criteria clarification

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-09-08 05:45:59
Message-ID: 20230908054559.ddweefvhuyjmwjz4@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-09-07 22:29:04 -0700, Peter Geoghegan wrote:
> On Thu, Sep 7, 2023 at 9:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I.e. setting an, otherwise unmodified, page all-visible won't trigger an FPI
> > if checksums are disabled, but will FPI with checksums enabled. I think that's
> > a substantial difference in WAL volume for insert-only workloads...
>
> Note that all RDS Postgres users get page-level checksums. Overall,
> the FPI trigger mechanism is going to noticeably improve performance
> characteristics for many users. Simple and crude though it is.

You mean the current heuristic or some new heuristic we're coming up with in
this thread? If the former, unfortunately I think the current heuristic often
won't trigger in cases where freezing would be fine, e.g. on an insert-mostly
(or hot pruned) workload with some read accesses. If the tuples are already
hint-bitted, there's no FPI during heap_page_prune(), and thus we don't freeze
- even though we *do* subsequently trigger an FPI, while setting all-visible.

See e.g. the stats for the modified version of the scenario, where there the
table is hint-bitted that I have since posted:
https://postgr.es/m/20230908053634.hyn46pugqp4lsiw5%40awork3.anarazel.de

There we freeze neither with nor without checksums, despite incurring FPIs
when checksums are enabled.

> > Type N (%) Record size (%) FPI size (%) Combined size (%)
> > ---- - --- ----------- --- -------- --- ------------- ---
> > XLOG/FPI_FOR_HINT 44253 ( 33.34) 2168397 ( 7.84) 361094232 (100.00) 363262629 ( 93.44)
> > Transaction/INVALIDATION 1 ( 0.00) 78 ( 0.00) 0 ( 0.00) 78 ( 0.00)
> > Standby/INVALIDATIONS 1 ( 0.00) 90 ( 0.00) 0 ( 0.00) 90 ( 0.00)
> > Heap2/FREEZE_PAGE 44248 ( 33.33) 22876120 ( 82.72) 0 ( 0.00) 22876120 ( 5.88)
> > Heap2/VISIBLE 44248 ( 33.33) 2610642 ( 9.44) 16384 ( 0.00) 2627026 ( 0.68)
> > Heap/INPLACE 1 ( 0.00) 188 ( 0.00) 0 ( 0.00) 188 ( 0.00)
> > -------- -------- -------- --------
> > Total 132752 27655515 [7.11%] 361110616 [92.89%] 388766131 [100%]
> >
> > In realistic tables, where rows are wider than a single int, FPI_FOR_HINT
> > dominates even further, as the FREEZE_PAGE would be smaller if there weren't
> > 226 tuples on each page...
>
> If FREEZE_PAGE WAL volume is really what holds back further high level
> improvements in this area, then it can be worked on directly -- it's
> not a fixed cost. It wouldn't be particularly difficult, in fact.

Agreed!

> These are records that still mostly consist of long runs of contiguous
> page offset numbers. They're ideally suited for compression using some
> kind of simple variant of run length encoding. The freeze plan
> deduplication stuff in 16 made a big difference, but it's still not
> very hard to improve matters here.

Yea, even just using ranges of offsets should help in a lot of cases.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jingtang Zhang 2023-09-08 05:54:51 Re: Suspicious redundant assignment in COPY FROM
Previous Message Michael Paquier 2023-09-08 05:41:54 Re: Suspicious redundant assignment in COPY FROM