RE: non-HOT update not looking at FSM for large tuple update

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: non-HOT update not looking at FSM for large tuple update
Date: 2021-02-24 20:10:23
Message-ID: 3d62ab82730943be86cb8fd15dbc715b@opammb0562.comp.optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi John,

> One idea is to take your -50 idea and make it more general and safe, by scaling the fudge factor based on fillfactor, such that if fillfactor is less than 100, the requested freespace is a bit smaller than the max. It's still a bit of a hack, though. I've attached a draft of this idea.

You’re right, that’d work better. Though, one thing I'd forgot to mention earlier is that in the "wild" where this occurred, the UPDATEs with these large tuple sizes are much rarer than UPDATEs with a much smaller tuple size. So this means that in reality, when this happens, the empty pages contain more unused line pointers and we’d need to subtract more bytes in order to find those pages in the fsm.

This is the (partial) output of pg_freespace function, bucketed by free space, for a real-life table with fillfactor=10 under the mixed load that I've described.

│ free │ count │

│ 7750 │ 2003 │

│ 7800 │ 7113 │

│ 7850 │ 1781 │

│ 7900 │ 6803 │

│ 7950 │ 13643 │

│ 8000 │ 64779 │

│ 8050 │ 2469665 │

│ 8100 │ 61869 │

└──────┴─────────┘

(163 rows)

The ‘free’ column is the bucket where the value is the lower limit. So, free=7500 means between 7500-7549 bytes free, and count is the number of pages that have this amount free according to the fsm.

In this case, the vast majority has between 8050-8099 bytes free according to the FSM. That means that, for this particular case, for a fillfactor of 10, we’d need to subtract ~120 bytes or so in order to properly recycle the pages.

-Floris

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2021-02-24 20:41:16 Re: Asynchronous and "direct" IO support for PostgreSQL.
Previous Message Greg Stark 2021-02-24 19:59:19 Re: Asynchronous and "direct" IO support for PostgreSQL.