
Re: BUG #18009: Postgres Recovery not happening

From:	Vamshikrishna T <tvk1271(at)gmail(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #18009: Postgres Recovery not happening
Date:	2023-07-10 11:27:27
Message-ID:	CA+t6Qsnj3W+4zCY3QwayLGUhkXGRSwgbTZox3KXJEg-kajjdHQ@mail.gmail.com
Lists:	pgsql-bugs

Hi Thomas,

Thank you for the confirmation on fdatasync. This definitely does not affect
regular AIX users, so we may not need any change.
The other thing I am interested in: it looks like Postgres 15.2 doesn't support
Direct or Concurrent I/O on AIX? I don't see any
setting where I can tune that. I feel DIO or CIO would be safer
than cached I/O. The fsync error behaviors across
different OSes are pretty interesting. Thanks for the info.

Thanks
Vamshi.

On Wed, 5 Jul 2023 at 09:06, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> On Wed, Jul 5, 2023 at 1:48 AM Vamshikrishna T <tvk1271(at)gmail(dot)com> wrote:
> > Thank you for your immediate response. This is not on the default file
> > system of AIX (JFS2), but on a special-purpose file system. It
> > looks like O_DSYNC (open_datasync) is causing the issue, as it seems
> > not to be honoured on this file system. Yes, the abrupt shutdown can be
> > treated as power loss.
>
> Sounds like a fun project. Is this alien technology that regular AIX
> users won't run into and I should forget this conversation ever
> happened, or is it a clue we should use wal_sync_method=fdatasync by
> default on AIX?
>
> > I used wal_sync_method=fdatasync (although I am not sure whether the problem
> > has vanished, because it is no longer reproducing), but what I observe
> > is immediate explicit sync calls on the files in the
> > ../pg_wal/ directory after the write calls complete.
> >
> > 2023-07-04 03:36:04.259 CDT|64a3d9b4.8701bc|LOG: checkpoint starting: time
> > 2023-07-04 03:36:06.263 CDT|64a3d9b4.8701bc|LOG: checkpoint complete: wrote 21 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.925 s, sync=0.053 s, total=2.005 s; sync files=20, longest=0.019 s, average=0.003 s; distance=32 kB, estimate=32 kB
> >
> > I can see that the write caches are cleared between these two timestamps.
> > With wal_sync_method=fdatasync, is it safe to assume that all
> > Postgres DB writes during checkpointing are flushed via an explicit OS-level
> > sync() or fsync() call, irrespective of O_DSYNC at open time?
>
> Yes. Our periodic checkpoints write out all the relation data (files
> like base/1234/2345 that hold tables and indexes), and then always
> call fsync() (sometimes the pwrite() calls and the fsync() happen in
> different processes*). But WAL data (files like
> pg_wal/000000010000000000000001) get the various wal_sync_method
> behaviours, with (IMHO unfortunately) different defaults based on a
> series of inconsistent platform-by-platform historical decisions...
>
> Just BTW, if you're interested in PostgreSQL on AIX, see
> https://wiki.postgresql.org/wiki/AIX .
>
> *With interesting cross-platform consequences. If you're a
> kernel/VMM/storage person you might find
> https://wiki.postgresql.org/wiki/Fsync_Errors interesting. Of course
> we have no idea what any closed source kernel does.
>
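
For readers following the thread, here is a rough sketch of the two flush styles discussed in the quoted text: opening a file with O_DSYNC so that each write is supposed to be durable when it returns (the open_datasync style), versus a plain write followed by an explicit fdatasync() call. It assumes a POSIX system and uses Python's os module only for illustration. The function names are hypothetical and this is not PostgreSQL's code.

```python
import os
import tempfile


def write_with_o_dsync(path: str, data: bytes) -> int:
    """open_datasync style: rely on O_DSYNC at open time.

    Assumption: the file system honours O_DSYNC. If it does not (as the
    thread suspects for one special-purpose file system), the write may
    return before the data is on stable storage.
    """
    # O_DSYNC is missing on some platforms; fall back to 0 (no flag) there.
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        return os.pwrite(fd, data, 0)
    finally:
        os.close(fd)


def write_then_fdatasync(path: str, data: bytes) -> int:
    """fdatasync style: plain write, then an explicit flush call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = os.pwrite(fd, data, 0)
        # Use fsync() where fdatasync() is not available.
        flush = getattr(os, "fdatasync", os.fsync)
        flush(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        page = b"\0" * 8192  # one 8 kB block, chosen only for illustration
        print(write_with_o_dsync(os.path.join(tmp, "dsync.bin"), page))
        print(write_then_fdatasync(os.path.join(tmp, "fdatasync.bin"), page))
```

Both functions return the number of bytes written. The difference is only in where the durability request is made: as a flag at open time, or as a separate system call after the write.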


In response to

Re: BUG #18009: Postgres Recovery not happening at 2023-07-05 03:35:28 from Thomas Munro

Responses

Re: BUG #18009: Postgres Recovery not happening at 2023-07-10 23:52:23 from Thomas Munro
