Re: Use streaming read API in pgstattuple.

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use streaming read API in pgstattuple.
Date: 2024-11-29 07:28:20
Message-ID: CAN55FZ0nMY3+vRUGrMri2eESiemiYmX0s2fJgexdJhnzH_H78g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

> Hello! Thank you for taking a peek. Your review comments have been
> corrected. Since my changes were wrong, I honestly don't know why this
> worked in version 1. By a miracle.
>
> As for CI, i rechecked v1:
>
> ```
> db2=#
> select * from pgstathashindex('test_hashidx');
> version | bucket_pages | overflow_pages | bitmap_pages | unused_pages
> | live_items | dead_items | free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+--------------
> 4 | 4 | 0 | 1 | 0
> | 0 | 0 | 100
> (1 row)
>
> db2=#
> select * from pgstattuple('test_hashidx');
> ERROR: could not read blocks 6..16 in file "base/16454/16473": read
> only 0 of 90112 bytes
> ```
>
> In v2 this behaves correctly.

Yes, I confirm that it works.

> Should we add `pgstattuple(...)` after `pgstat*type*index(..)`
> everywhere in pgstattuple regression tests?

Not everywhere but at least one pgstattuple(...) call for each index type.

There is one behavior change, it may not be important but I think it
is worth mentioning:

- for (; blkno < nblocks; blkno++)
+ p.last_exclusive = nblocks;
+
+ while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
{
CHECK_FOR_INTERRUPTS();

- pagefn(&stat, rel, blkno, bstrategy);
+ pagefn(&stat, rel, buf);
}
+
+ Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);

With this change we assume that if stream returns an invalid buffer
that means stream must be finished. We don't check if the stream
didn't finish but somehow read an invalid buffer. In the upstream
version, if we read an invalid buffer then postgres server will fail.
But in the patched version, the server will continue to run because it
will think that stream has reached to end. This could be happening for
other streaming read users; for example at least the
read_stream_next_buffer() calls in the collect_corrupt_items()
function face the same issue.

--
Regards,
Nazir Bilal Yavuz
Microsoft

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-11-29 07:28:59 Re: Remove useless GROUP BY columns considering unique index
Previous Message Amit Langote 2024-11-29 07:26:16 Re: doc fail about ALTER TABLE ATTACH re. NO INHERIT