From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Make relfile tombstone files conditional on WAL level |
Date: | 2021-09-30 10:32:20 |
Message-ID: | CA+hUKGLs554tQFCUjv_vn7ft9Xv5LNjPoAd--3Df+JJKJ7A8kw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Sep 29, 2021 at 4:07 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Hmm, on closer inspection, isn't the lack of real interlocking with
> checkpoints a bit suspect already? What stops bgwriter from writing
> to the previous relfilenode generation's fd, if a relfilenode is
> recycled while BgBufferSync() is running? Not sinval, and not the
> above code that only runs between BgBufferSync() invocations.
I managed to produce a case where live data is written to an unlinked
file and lost, with a couple of tweaks to get the right timing and
simulate OID wraparound. See attached. If you run the following
commands repeatedly with shared_buffers=256kB and
bgwriter_lru_multiplier=10, you should see a number lower than 10,000
from the last query in some runs, depending on timing.
create extension if not exists chaos;
create extension if not exists pg_prewarm;
drop table if exists t1, t2;
checkpoint;
vacuum pg_class;
select clobber_next_oid(200000);
create table t1 as select 42 i from generate_series(1, 10000);
select pg_prewarm('t1'); -- fill buffer pool with t1
update t1 set i = i; -- dirty t1 buffers so bgwriter writes some
select pg_sleep(2); -- give bgwriter some time
drop table t1;
checkpoint;
vacuum pg_class;
select clobber_next_oid(200000);
create table t2 as select 0 i from generate_series(1, 10000);
select pg_prewarm('t2'); -- fill buffer pool with t2
update t2 set i = 1 where i = 0; -- dirty t2 buffers so bgwriter writes some
select pg_sleep(2); -- give bgwriter some time
select pg_prewarm('pg_attribute'); -- evict all clean t2 buffers
select sum(i) as t2_sum_should_be_10000 from t2; -- have any updates been lost?
Attachment | Content-Type | Size |
---|---|---|
0001-HACK-A-function-to-control-the-OID-allocator.patch | text/x-patch | 2.9 KB |
0002-HACK-Slow-the-bgwriter-down-a-bit.patch | text/x-patch | 1.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fabien COELHO | 2021-09-30 10:36:47 | Re: rand48 replacement |
Previous Message | vignesh C | 2021-09-30 10:27:48 | Re: Added schema level support for publication. |