Re: pgstattuple: fix free space calculation

From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgstattuple: fix free space calculation
Date: 2024-09-09 13:47:49
Message-ID: ea72b283-d11b-49f4-ac96-b6f146249645@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tom, thanks for your review.

On 9/7/24 22:10, Tom Lane wrote:
> I looked at this patch. I agree with making the change. However,
> I don't agree with the CF entry's marking of "target version: stable"
> (i.e., requesting back-patch). I think this falls somewhere in the
> gray area between a bug fix and a definitional change. Also, people
> are unlikely to be happy if they suddenly get new, not-comparable
> numbers after a minor version update. So I think we should just fix
> it in HEAD.
>

OK, I did the change.

> As far as the patch itself goes, the one thing that is bothering me
> is this comment change
>
> /*
> - * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
> + * It's not safe to call PageGetExactFreeSpace() on new pages, so we
> * treat them as being free space for our purposes.
> */
>
> which looks like it wasn't made with a great deal of thought.
> Now it seems to me that the comment was already bogus when written:
> there isn't anything uncertain about what will happen if you call
> either of these functions on a "new" page. PageIsNew checks for
>
> return ((PageHeader) page)->pd_upper == 0;
>
> If pd_upper is 0, PageGet[Exact]FreeSpace is absolutely guaranteed
> to return zero, even if pd_lower contains garbage. And then

Indeed. I failed to notice that LocationIndex was an unsigned int, so I
thought that pg_upper - pd_upper could be positive with garbage in pg_upper.

> PageGetHeapFreeSpace will likewise return zero. Perhaps there
> could be trouble if we got into the line-pointer-checking part
> of PageGetHeapFreeSpace, but we can't. So this comment is wrong,
> and is even more obviously wrong after the above change. I thought
> for a moment about removing the PageIsNew test altogether, but
> then I decided that it probably*is* what we want and is just
> mis-explained. I think the comment should read more like
>
> /*
> * PageGetExactFreeSpace() will return zero for a "new" page,
> * but it's actually usable free space, so count it that way.
> */
>
> Now alternatively you could argue that a "new" page isn't usable free
> space yet and so we should count it as zero, just as we don't count
> dead tuples as usable free space. You need VACUUM to turn either of
> those things into real free space. But that'd be a bigger definitional
> change, and I'm not sure we want it. Thoughts?
>
> Also, do we need any documentation change for this? I looked through
> https://www.postgresql.org/docs/devel/pgstattuple.html
> and didn't see anything that was being very specific about what
> "free space" means, so maybe it's fine as-is.

It's not easy. Maybe something like this?

"For any initialized page, free space refers to anything that isn't page
metadata (header and special), a line pointer or a tuple pointed to by a
valid line pointer. In particular, a dead tuple is not free space
because there's still a valid line pointer pointer pointing to it, until
VACUUM or some other maintenance mechanism (e.g. page pruning) cleans up
the page. A dead line pointer is not free space either, but the tuple it
points to has become free space. An unused line pointer could be
considered free space, but pgstattuple doesn't take it into account."

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Frédéric Yhuel 2024-09-09 13:49:54 Re: pgstattuple: fix free space calculation
Previous Message Peter Eisentraut 2024-09-09 13:37:23 Re: tiny step toward threading: reduce dependence on setlocale()