From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: Avoiding superfluous buffer locking during nbtree backwards scans
Date: 2024-11-07 10:44:24
Message-ID: f8efb9c0f8d1a71b44fd7f8e42e49c25@oss.nttdata.com
Lists: pgsql-hackers
Hi, thanks for working on these improvements.
I noticed an unexpected behavior where the index scan continues instead of
stopping, even after it has detected that no remaining tuples can match the
conditions. (I observed this while reviewing the skip scan patch, though the
problem isn't directly related to that patch.)
On 2024-10-12 08:29, Peter Geoghegan wrote:
> On Thu, Oct 10, 2024 at 1:41 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> * We now reset currPos state (including even its moreLeft/moreRight
> fields) within _bt_parallel_seize, automatically and regardless of any
> other details.
IIUC, the above change is the root cause. Commit 1bd4bc8 adds a reset of the
currPos state in _bt_parallel_seize(). However, this reset can overwrite
currPos.moreRight, which needs to be preserved until _bt_readnextpage() is
called.
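To illustrate the failure mode, here is a minimal standalone C sketch (my own
simplified illustration, not the actual nbtree code; ScanPos, seize_scan() and
read_next_page() are hypothetical stand-ins for currPos, _bt_parallel_seize()
and _bt_readnextpage()). Once the scan has set moreRight = false because it
already found the end of the matching range, an unconditional reset in the
seize step loses that signal, and the next-page step keeps walking right:

    /* Simplified illustration of the state-clobbering pattern (not nbtree code). */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    typedef struct ScanPos
    {
        bool moreRight;   /* false once the scan has seen the last matching tuple */
    } ScanPos;

    /* Mimics a seize step that unconditionally resets the position state. */
    static void
    seize_scan(ScanPos *pos)
    {
        memset(pos, 0, sizeof(*pos));
        pos->moreRight = true;    /* the earlier "stop here" signal is lost */
    }

    /* Mimics the next-page step: it only stops when moreRight is false. */
    static bool
    read_next_page(const ScanPos *pos)
    {
        return pos->moreRight;    /* true => keep reading pages to the right */
    }

    int
    main(void)
    {
        ScanPos pos = { .moreRight = false };   /* scan already found the end of matches */

        seize_scan(&pos);                       /* reset clobbers moreRight */
        printf("continue scan? %s\n", read_next_page(&pos) ? "yes (bug)" : "no");
        return 0;
    }

With the reset in place this prints "continue scan? yes (bug)", which mirrors
how the real scan keeps reading leaf pages past the end of the matching range.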
* Test case
-- Prepare
DROP TABLE IF EXISTS test;
CREATE TABLE test (smallint smallint, bool bool);
INSERT INTO test (SELECT -20000+i%40000, random()>0.5 FROM generate_series(1, 1_000_000) s(i));
CREATE INDEX test_smallint ON test (smallint);
VACUUM ANALYZE test;
-- Check the number of pages of the index
=# SELECT relpages FROM pg_class WHERE relname = 'test_smallint';
relpages
----------
937
(1 row)
-- Test
=# SET max_parallel_workers = 0;
=# EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT COUNT(*) FROM test WHERE smallint < -10000;
                                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=5170.23..5170.24 rows=1 width=8) (actual time=71.352..71.402 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=934
   ->  Gather  (cost=5170.01..5170.22 rows=2 width=8) (actual time=71.344..71.395 rows=1 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 0
         Buffers: shared hit=934
         ->  Partial Aggregate  (cost=4170.01..4170.02 rows=1 width=8) (actual time=71.199..71.199 rows=1 loops=1)
               Output: PARTIAL count(*)
               Buffers: shared hit=934
               ->  Parallel Index Only Scan using test_smallint on public.test  (cost=0.42..3906.27 rows=105495 width=0) (actual time=0.062..49.137 rows=250000 loops=1)
                     Output: "smallint"
                     Index Cond: (test."smallint" < '-10000'::integer)
                     Heap Fetches: 0
                     Buffers: shared hit=934    -- This is not the result I expected. Almost all
                                                -- relpages are being read to retrieve only 25% of
                                                -- the tuples. Without commit 1bd4bc8, the number
                                                -- was '236' in my environment.
 Planning Time: 0.105 ms
 Execution Time: 71.454 ms
(18 rows)
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION