RE: non-HOT update not looking at FSM for large tuple update

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: Floris Van Nee <florisvannee(at)Optiver(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: non-HOT update not looking at FSM for large tuple update
Date: 2021-02-24 20:52:35
Message-ID: f5803d34810c4026a628121dd699d515@opammb0562.comp.optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> In this case, the vast majority has between 8050-8099 bytes free according to the FSM. That means that, for this particular case, for a fillfactor of 10, we’d need to subtract ~120 bytes or so in order to properly recycle the pages.

Also, I think this "fudge" factor would need to be defined as a percentage of the page size as well. 100 bytes on an 8kB page is quite different than 100 bytes on a 1kB page (although I have no idea if people ever actually compile PG with a different page size, but it is supposed to be supported?).

I also understand the temptation to define it based on the relation's fill factor, as you did in the patch. However, upon some further thought I wonder if that's useful? A relation with a higher fill factor will have a lower 'saveFreeSpace' variable, so it's less likely to run into issues in finding a fresh page, except if the tuple you're inserting/updating is even larger. However, if that case happens, you'll still be wanting to look for a page that's completely empty (except for the line items). So the only proper metric is 'how many unused line items do we expect on empty pages' and the fillfactor doesn't say much about this. Since this is probably difficult to estimate at all, we may be better off just defining it off MaxHeapTupleSize completely?
For example, we expect 1.5% of the page could be line items, then:

targetFreeSpace = MaxHeapTupleSize * 0.985

-Floris

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-02-24 21:03:16 Re: Asynchronous and "direct" IO support for PostgreSQL.
Previous Message Dmitry Dolgov 2021-02-24 20:41:16 Re: Asynchronous and "direct" IO support for PostgreSQL.