Re: Some questions about PostgreSQL’s design.

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: 陈宗志 <baotiao(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Some questions about PostgreSQL’s design.
Date: 2024-08-20 13:46:54
Message-ID: 8400b24d-37d0-49ee-94c9-ba8709dcf9ab@iki.fi
Lists: pgsql-hackers

On 20/08/2024 11:46, 陈宗志 wrote:
> I’ve recently started exploring the PostgreSQL implementation. I used to
> be a MySQL InnoDB developer, and I find the PostgreSQL community feels
> a bit strange.
>
> There are some areas where they’ve done really well, but there are
> also some obvious issues that haven’t been improved.
>
> For example, the B-link tree implementation in PostgreSQL is
> particularly elegant, and the code is very clean.
> But there are some clear areas that could be improved but haven’t been
> addressed, like the double memory problem where the buffer pool and
> page cache store the same page, using full-page writes to deal with
> torn page writes instead of something like InnoDB’s double write
> buffer.
>
> It seems like these issues have clear solutions, such as using
> DirectIO like InnoDB instead of buffered IO, or using a double write
> buffer instead of relying on the full-page write approach.
> Can anyone explain why?

There are pros and cons. With direct I/O, you cannot take advantage of
the kernel page cache anymore, so it becomes important to tune
shared_buffers more precisely. That's a downside: the system requires
more tuning. For many applications, squeezing the last ounce of
performance just isn't that important. There are also scaling issues
with the Postgres buffer cache, which might need to be addressed first.
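As a rough illustration of the tuning point, here is a toy model (not
PostgreSQL code) that counts disk reads for a cyclic scan. Both caches are
modelled as plain LRU lists, and the names `LRU`, `count_disk_reads`,
`shared_buffers` and `page_cache` are hypothetical parameters of the sketch,
measured in pages. Real buffer replacement and kernel caching behave
differently; the sketch only shows why an undersized buffer pool hurts more
once the kernel page cache is bypassed.

```python
from collections import OrderedDict


class LRU:
    """Minimal LRU set of page numbers with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def get(self, page_no):
        # Return True on a hit and mark the page as most recently used.
        if page_no in self.pages:
            self.pages.move_to_end(page_no)
            return True
        return False

    def put(self, page_no):
        # Insert the page, evicting least recently used pages if over capacity.
        self.pages[page_no] = True
        self.pages.move_to_end(page_no)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)


def count_disk_reads(accesses, shared_buffers, page_cache, direct_io):
    """Count reads that reach the disk for a sequence of page accesses."""
    pool = LRU(shared_buffers)   # the database's own buffer pool
    kernel = LRU(page_cache)     # the kernel page cache (unused with direct I/O)
    disk_reads = 0
    for page_no in accesses:
        if pool.get(page_no):
            continue             # served from the buffer pool
        if direct_io or not kernel.get(page_no):
            disk_reads += 1      # neither cache could serve the page
            if not direct_io:
                kernel.put(page_no)  # buffered I/O caches the page a second time
        pool.put(page_no)
    return disk_reads


# Ten passes over 8 pages: the working set is larger than a 4-page pool.
scan = list(range(8)) * 10

buffered = count_disk_reads(scan, shared_buffers=4, page_cache=8, direct_io=False)
direct_small = count_disk_reads(scan, shared_buffers=4, page_cache=0, direct_io=True)
direct_tuned = count_disk_reads(scan, shared_buffers=8, page_cache=0, direct_io=True)
print(buffered, direct_small, direct_tuned)
```

In this model the buffered run reaches the disk 8 times even though the pool
is too small, because the page cache absorbs the misses. The direct I/O run
with the same small pool reaches the disk 80 times, and it only gets back to
8 once the pool is sized to hold the whole working set.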

With double write buffering, there are also pros and cons. It also
requires careful tuning. And replaying WAL that contains full-page
images can be much faster, because you can write new page images
"blindly" without reading the old pages first. We have WAL prefetching
now, which alleviates that, but it's no panacea.
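To make the "blind write" point concrete, here is a toy replay loop (again
not PostgreSQL code). The record layout, the `"fpi"` and `"delta"` kinds, and
the `replay` function are assumptions made up for this sketch. A full-page
image overwrites the page without looking at what is on disk, while a delta
record has to read the old page before it can be applied.

```python
def replay(disk, wal):
    """Apply WAL records to `disk`; return how many pages had to be read first."""
    cache = {}
    reads = 0
    for kind, page_no, payload in wal:
        if kind == "fpi":
            # Full-page image: replace the page blindly, no read needed,
            # so a torn on-disk copy does not matter.
            cache[page_no] = bytearray(payload)
        else:
            # Delta record: needs the previous page contents as a base.
            if page_no not in cache:
                cache[page_no] = bytearray(disk[page_no])
                reads += 1
            offset, data = payload
            cache[page_no][offset:offset + len(data)] = data
    # Write all replayed pages back to the simulated disk.
    for page_no, image in cache.items():
        disk[page_no] = bytes(image)
    return reads


# Page 0 is torn on disk: only its first half was written before the crash.
torn_disk = {0: b"A" * 8 + b"?" * 8}
wal_with_fpi = [("fpi", 0, b"A" * 16), ("delta", 0, (0, b"B"))]
reads_with_fpi = replay(torn_disk, wal_with_fpi)

# Without a full-page image, replay must read the (intact) old page first.
intact_disk = {0: b"A" * 16}
wal_without_fpi = [("delta", 0, (0, b"B"))]
reads_without_fpi = replay(intact_disk, wal_without_fpi)

print(reads_with_fpi, reads_without_fpi)
```

With the full-page image in the WAL, the torn page is repaired and no read is
issued. Without it, replay pays one read per page it touches, and that read
is the cost that prefetching tries to hide.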

In summary, those are good solutions but they're not obviously better in
all circumstances.

> However, the PostgreSQL community’s mailing list is truly a treasure
> trove, where you can find really interesting discussions. For
> instance, this discussion on whether lock coupling is needed for
> B-link trees, etc.
> https://www.postgresql.org/message-id/flat/CALJbhHPiudj4usf6JF7wuCB81fB7SbNAeyG616k%2Bm9G0vffrYw%40mail.gmail.com

Yep, there are old threads and patches for double write buffers and
direct IO too :-).

--
Heikki Linnakangas
Neon (https://neon.tech)
