Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal
Date: 2024-07-10 05:56:13
Message-ID: 7c1b48d8-0f08-4733-a4e6-d55f04581db7@oss.nttdata.com
Lists: pgsql-hackers

On 2024/07/10 2:57, Robert Haas wrote:
> Here's v2.

Thanks for the patch!

With the v2 patch, I found a case where WAL data generated with
wal_level = minimal is summarized. In the steps below,
the hoge2_minimal table is created under wal_level = minimal,
but its block modification information is included in
the WAL summary files. I confirmed this by checking
the contents of WAL summary files using pg_wal_summary_contents().

Additionally, the hoge3_replica table created under
wal_level = replica is not summarized.

-------------------------------------------
initdb -D data
echo "wal_keep_size = '16GB'" >> data/postgresql.conf

pg_ctl -D data start
psql <<EOF
SHOW wal_level;
CREATE TABLE hoge1_replica AS SELECT n FROM generate_series(1, 100) n;
ALTER SYSTEM SET max_wal_senders TO 0;
ALTER SYSTEM SET wal_level TO 'minimal';
EOF

pg_ctl -D data restart
psql <<EOF
SHOW wal_level;
CREATE TABLE hoge2_minimal AS SELECT n FROM generate_series(1, 100) n;
ALTER SYSTEM SET wal_level TO 'replica';
EOF

pg_ctl -D data restart
psql <<EOF
SHOW wal_level;
CREATE TABLE hoge3_replica AS SELECT n FROM generate_series(1, 100) n;
CHECKPOINT;
CREATE TABLE hoge4_replica AS SELECT n FROM generate_series(1, 100) n;
CHECKPOINT;
ALTER SYSTEM SET summarize_wal TO on;
SELECT pg_reload_conf();
SELECT pg_sleep(5);
SELECT wsc.*, c.relname
  FROM pg_available_wal_summaries()
  JOIN LATERAL pg_wal_summary_contents(tli, start_lsn, end_lsn) wsc ON true
  JOIN pg_class c ON wsc.relfilenode = c.relfilenode
 WHERE c.relname LIKE 'hoge%'
 ORDER BY c.relname;
EOF
-------------------------------------------

I believe this issue occurs when the server is shut down cleanly.
The shutdown-checkpoint record retains the wal_level value that was
in effect before the shutdown. If wal_level is changed after that,
the wal_level indicated by the shutdown-checkpoint record may not
match the wal_level under which the subsequent WAL data is generated.
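
As a rough illustration of this kind of mismatch, the sketch below compares
the wal_level recorded in the control file (as printed by pg_controldata)
with the value that ALTER SYSTEM wrote to postgresql.auto.conf. Note the
assumptions: the control file is a separate structure from the
shutdown-checkpoint record, so this only shows the analogous "recorded
before shutdown vs. configured for the next start" gap, and the three
function names are hypothetical helpers, not part of the patch or of
PostgreSQL.

```shell
# Hypothetical helpers for illustration only (not part of the patch).

# Read pg_controldata output on stdin and print the recorded wal_level.
# pg_controldata prints a line of the form "wal_level setting:   replica".
control_wal_level() {
    awk -F': *' '/^wal_level setting:/ { print $2 }'
}

# Read postgresql.auto.conf on stdin and print the last wal_level
# written by ALTER SYSTEM, or nothing if it was never set there.
configured_wal_level() {
    sed -n "s/^wal_level *= *'\{0,1\}\([a-z]*\)'\{0,1\}.*/\1/p" | tail -n 1
}

# Compare both values for a stopped cluster whose data directory is $1.
report_wal_level_mismatch() {
    recorded=$(pg_controldata "$1" | control_wal_level)
    configured=$(configured_wal_level < "$1/postgresql.auto.conf")
    if [ -n "$configured" ] && [ "$recorded" != "$configured" ]; then
        echo "mismatch: recorded=$recorded configured=$configured"
    else
        echo "no mismatch: recorded=$recorded"
    fi
}
```

For example, after the first psql block in the steps above, I would expect
"pg_ctl -D data stop" followed by "report_wal_level_mismatch data" to
report recorded=replica and configured=minimal, since the new setting only
takes effect at the next start.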

I'm sure this patch is necessary as a safeguard for WAL summarization.
OTOH, I also think we should apply the patch I proposed earlier
in this thread, which prevents summarize_wal from being enabled
when wal_level is set to minimal. This way, if there's
a misconfiguration, users will see an error message and
can quickly identify and fix the issue. Thoughts?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
