Omitting relpages for toast table access not expected

From: Stefan Litsche <slitsche(at)biosoft(dot)de>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Omitting relpages for toast table access not expected
Date: 2024-07-05 14:41:41
Message-ID: CA+bTu2RaUzKUqEdd6pPchW3tKvNcfbS6d7ADp8NUuuj871sK_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello PostgreSQL team,

first and foremost I'd like to thank you for the great work on this
wonderful
database.

I want to share three observations which show not expected behavior.

I'm running PostgreSQL 15.7 (Debian 15.7-1.pgdg120+1) on
x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit in a docker container on
Ubuntu 22.04.4 LTS.

# First observation

Trying to learn the effect of toast for jsonb I want to have a table with a
jsonb column. Since I want not to be distracted by the effect of
compression I
set the storage to external. I populate the test rows with values which a
slightly smaller than the threshold for compression.

CREATE TABLE toasttest (jb jsonb);
ALTER TABLE toasttest ALTER COLUMN jb SET STORAGE EXTERNAL;
INSERT INTO toasttest
SELECT
jsonb_build_object(
'id', i,
'foo', (select jsonb_agg(0) from generate_series(1, 1960/12)) -- [0,0,0,
...]
) jb
FROM
generate_series(1, 10000) i;

Checking the statistics to verify if expectations are met:

SELECT relname, relpages, reltuples
FROM pg_class
WHERE oid IN (
SELECT UNNEST(ARRAY[oid, reltoastrelid])
FROM pg_class
WHERE oid = 'toasttest'::regclass);

relname | relpages | reltuples
----------------+----------+-----------
toasttest | 2500 | 10000
pg_toast_56808 | 0 | -1

EXPLAIN (ANALYZE, BUFFERS) SELECT jb->'id' FROM toasttest;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on toasttest (cost=0.00..2625.00 rows=10000 width=32) (actual
time=0.015..4.736 rows=10000 loops=1)
Buffers: shared hit=2500
Planning:
Buffers: shared hit=5
Planning Time: 0.145 ms
Execution Time: 5.357 ms
(6 rows)

So far so good. Now I want to observe the effect of updating the jsonb
value. Since the storage has be set to external, I expect the jsonb value
to be
moved to the toast table.

UPDATE toasttest SET jb = jb || '{"bar": "baz"}';
VACUUM FULL toasttest;
ANALYZE toasttest;

The output for the statistics and explain plan after updating looks as
follows:

relname | relpages | reltuples
----------------+----------+-----------
toasttest | 64 | 10000
pg_toast_56808 | 3334 | 20000

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on toasttest (cost=0.00..189.00 rows=10000 width=32) (actual
time=0.147..28.911 rows=10000 loops=1)
Buffers: shared hit=30048 read=32
Planning:
Buffers: shared hit=5
Planning Time: 0.115 ms
Execution Time: 29.339 ms

As you can observe the cost has decreased by an order of magnitude while the
buffers accessed during execution has increased by an order of magnitude.

I would have expected that the cost also should have increased. This would
be
expected because also the number of buffers have increased if we consider
the
toast table.

Only the number of pages for the main table have decreased and would
explain the
drop in the costs if only the main table would be considered.

Shouldn't the cost of the plan also reflect the cost of accessing pages in
the
toast table?

This was my first observation. Now lets move to the second.

# Second observation

If I now want to reproduce the behavior from above, I start with dropping
the
table.

DROP TABLE toasttest;

If I now investigate the relpages before and after the UPDATE, I observe the
following output:

CREATE TABLE
ALTER TABLE
INSERT 0 10000
ANALYZE

relname | relpages | reltuples
----------------+----------+-----------
toasttest | 2500 | 10000
pg_toast_66818 | 0 | -1
(2 rows)

UPDATE 10000
VACUUM
ANALYZE

relname | relpages | reltuples
----------------+----------+-----------
toasttest | 64 | 10000
pg_toast_66818 | 0 | -1
(2 rows)

So after dropping the table, recreating and analyzing it, I do not get
updated
values for relpages for the corresponding toast table. I would expect that
the
analyze command also updates the statistics for the toast table.

However, in some situations the values for the toast table appear after some
time, but I was not able yet to find out, when.

# Third observation

I tried to reproduce the behavior on the latest version: PostgreSQL 16.3
(Debian
16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14)
12.2.0, 64-bit. Again running it as a docker image.

The observation regarding query planner cost could be reproduced with one
minor
difference. Even after fresh installation the first analyze after the
update
did NOT show the relpages for the toast table. The cost decreased after the
update as described above. As described in the second observation, the
values
of relpages for the toast table appeared some time later.
The decrease in cost for the execution plan stays even after the relpages
for the
toast table appeared after some time.

If anything is unclear in my description I would be happy to help.

Kind regards
Stefan Litsche

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Masahiko Sawada 2024-07-05 14:52:52 Re: Potential data loss due to race condition during logical replication slot creation
Previous Message Etsuro Fujita 2024-07-05 12:49:27 Re: BUG #17889: Invalid cursor direction for a foreign scan that reached the fetch_size (MOVE BACKWARD ALL IN cX)