From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | tsukiwamoon(dot)pgsql(at)gmail(dot)com |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Performance improvement of WAL writing? |
Date: | 2019-08-28 08:17:40 |
Message-ID: | 20190828.171740.88050246.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
Hi, Moon-san.
At Wed, 28 Aug 2019 15:43:02 +0900, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com> wrote in <CAEMmqBvQ56-CvUy2kyJp0JJnFE9EAm4t0aGnSfiH+z4=Hw_6DQ(at)mail(dot)gmail(dot)com>
> Dear Hackers.
>
> Currently, the XLogWrite function is written in 8k(or 16kb) units
> regardless of the size of the new record.
> For example, even if a new record is only 300 bytes, pg_pwrite is
> called to write data in 8k units (if it cannot be writing on one page
> is 16kb written).
> Let's look at the worst case.
> 1) LogwrtResult.Flush is 8100 pos.
> 2) And the new record size is only 100 bytes.
> In this case, pg_pwrite is called which writes 16 kb to update only 100 bytes.
> It is a rare case, but I think there is overhead for pg_pwrite for some systems.
> # For systems that often update one record in one transaction.
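The worst case in the quoted message can be checked with a little arithmetic. The following is only a sketch of the behavior as described above (whole pages are written, covering the range from the old flush position to the end of the new record); `WAL_PAGE_SIZE`, `wal_bytes_written` and `check_examples` are hypothetical names for illustration, not PostgreSQL identifiers, and an 8 kB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size: the "8k" unit mentioned in the message. */
#define WAL_PAGE_SIZE ((uint64_t) 8192)

/*
 * Bytes physically written if every page touched by the byte range
 * [start, start + len) is written out in full.
 * Hypothetical helper, not a PostgreSQL function.
 */
uint64_t wal_bytes_written(uint64_t start, uint64_t len)
{
    if (len == 0)
        return 0;

    uint64_t first_page = start / WAL_PAGE_SIZE;
    uint64_t last_page = (start + len - 1) / WAL_PAGE_SIZE;

    return (last_page - first_page + 1) * WAL_PAGE_SIZE;
}

/* The two cases from the quoted message. */
void check_examples(void)
{
    /* A 300-byte record inside one page costs one full page. */
    assert(wal_bytes_written(0, 300) == 8192);

    /* Flush position 8100 plus 100 bytes crosses a page boundary: two pages. */
    assert(wal_bytes_written(8100, 100) == 16384);
}
```

Under these assumptions the 100-byte record ends at byte 8200, just past the first page boundary at 8192, which is why two pages are counted.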
If a commit is the only one in the system, the 100 bytes ought to be
flushed out immediately. If there are many concurrent commits,
XLogFlush() waits for all in-flight insertions up to the LSN the
backend is writing, then actually flushes. Of course it sometimes
flushes just 100 bytes, but sometimes it flushes far more bytes,
including records from other backends. If another backend has
already flushed past the backend's up-to LSN, the backend skips
the flush.
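To illustrate the skipping behavior, here is a deliberately simplified, single-threaded model. It is not the XLogFlush() code: `WalModel`, `model_insert` and `model_flush` are hypothetical names, locking and waiting for in-flight insertions are left out, and the model assumes a flush always covers everything inserted so far.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the shared WAL positions. */
typedef struct WalModel
{
    uint64_t inserted;    /* end of the last record inserted by any backend */
    uint64_t flushed;     /* everything before this position is durable */
    int      flush_calls; /* number of flushes actually performed */
} WalModel;

/* A backend inserts a record; the result is the LSN it needs flushed at commit. */
uint64_t model_insert(WalModel *w, uint64_t len)
{
    w->inserted += len;
    return w->inserted;
}

/* Returns true if this call flushed, false if the flush was skipped. */
bool model_flush(WalModel *w, uint64_t upto)
{
    assert(upto <= w->inserted);

    /* Another backend already flushed past our LSN: nothing to do. */
    if (upto <= w->flushed)
        return false;

    /* Assumption: one flush covers all records inserted so far,
     * including those of other backends. */
    w->flushed = w->inserted;
    w->flush_calls++;
    return true;
}
```

In this model, if backend A inserts 100 bytes, backend B inserts 200 bytes, and A flushes first, A's flush covers all 300 bytes and B's later call returns without flushing.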
If you want to involve more commits in a flush, commit_delay lets
the backend wait for that duration so that more commits can come
in within the window. Or synchronous_commit = off works somewhat
more aggressively.
> So what about modifying the XLogWrite function only to write the size
> that should record?
If I understand your proposal correctly, you're proposing to
separate fsync from XLogWrite. Actually, as you proposed, roughly
speaking no flush happens except at a segment switch or commit. If
you are proposing not flushing immediately even at commit,
commit_delay (or synchronous_commit) works that way.
> Can this idea benefit from WAL writing performance?
> If it's OK to improve, I want to do modification.
> How do you think of it?
So the proposal seems to be already achieved. If not, could you
elaborate on the proposal, or explain the actual problem?
> Best Regards.
> Moon.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center