From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Takashi Menjo <takashi(dot)menjo(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Takashi Menjo <takashi(dot)menjou(dot)vg(at)hco(dot)ntt(dot)co(dot)jp>, "Deng, Gang" <gang(dot)deng(at)intel(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PoC] Non-volatile WAL buffer |
Date: | 2020-11-25 01:44:55 |
Message-ID: | de05bcb4-0441-84c1-8eaf-45beefad1d67@enterprisedb.com |
Lists: | pgsql-hackers |
On 11/25/20 1:27 AM, tsunakawa(dot)takay(at)fujitsu(dot)com wrote:
> From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
>> It's interesting that they only place the tail of the log on PMEM,
>> i.e. the PMEM buffer has limited size, and the rest of the log is
>> not on PMEM. It's a bit as if we inserted a PMEM buffer between our
>> wal buffers and the WAL segments, and kept the WAL segments on
>> regular storage. That could work, but I'd bet they did that because
>> at that time the NV devices were much smaller, and placing the
>> whole log on PMEM was not quite possible. So it might be
>> unnecessarily complicated, considering the PMEM device capacity is
>> much higher now.
>>
>> So I'd suggest we simply try this:
>>
>> clients -> buffers (DRAM) -> wal segments (PMEM)
>>
>> I plan to do some hacking and maybe hack together some simple tools
>> to benchmark various approaches.
>
> I'm in favor of your approach. Yes, Intel PMEM were available in
> 128/256/512 GB when I checked last year. That's more than enough to
> place all WAL segments, so a small PMEM wal buffer is not necessary.
> I'm excited to see Postgres gain more power.
>
Cool. FWIW I'm not 100% sure it's the right approach, but I think it's
worth testing. In the worst case we'll discover that this architecture
does not allow fully leveraging PMEM benefits, or maybe it won't work
for some other reason and the approach proposed here will work better.
Let's play a bit and we'll see.

I have hacked a very simple patch doing this (essentially replacing
open/write/close calls in xlog.c with pmem calls). It's a bit rough but
seems good enough for testing/experimenting. I'll polish it a bit, do
some benchmarks, and share some numbers in a day or two.
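
To illustrate the general idea of swapping open/write/close for a mapped segment, here is a minimal sketch. It is not the patch mentioned above, and it does not touch xlog.c. All names (SegmentMap, segment_map_open, segment_map_append, segment_map_close) are hypothetical. Plain POSIX mmap()/msync() stand in for the pmem calls so the sketch builds without libpmem; with libpmem the three steps would roughly correspond to pmem_map_file(), pmem_memcpy_persist() and pmem_unmap().

```c
/* Sketch only: append records to a memory-mapped segment file instead of
 * using open/write/close for each write. mmap()/msync() are a stand-in for
 * the pmem calls; every name below is hypothetical, not PostgreSQL code. */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct SegmentMap
{
    char   *base;        /* start of the mapped segment */
    size_t  size;        /* total segment size in bytes */
    size_t  insert_off;  /* offset of the next record */
} SegmentMap;

/* Create (or open) the file, size it, and map it once. */
int
segment_map_open(SegmentMap *seg, const char *path, size_t size)
{
    assert(seg != NULL && path != NULL && size > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) size) != 0)
    {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);           /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    seg->base = p;
    seg->size = size;
    seg->insert_off = 0;
    return 0;
}

/* Copy one record into the mapping and flush it. */
int
segment_map_append(SegmentMap *seg, const void *rec, size_t len)
{
    assert(seg != NULL && seg->base != NULL);

    if (len > seg->size - seg->insert_off)
        return -1;       /* record does not fit in this segment */

    memcpy(seg->base + seg->insert_off, rec, len);

    /* msync() wants a page-aligned address, so this sketch simply syncs
     * the whole mapping; a real pmem path would flush only the range. */
    if (msync(seg->base, seg->size, MS_SYNC) != 0)
        return -1;

    seg->insert_off += len;
    return 0;
}

/* Unmap the segment. */
int
segment_map_close(SegmentMap *seg)
{
    assert(seg != NULL && seg->base != NULL);

    int rc = munmap(seg->base, seg->size);
    seg->base = NULL;
    return rc;
}
```

A caller would map a segment once, call segment_map_append() for each record, and unmap when switching segments. Whether this maps well onto the actual WAL write path is exactly the kind of thing the benchmarks are meant to show.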

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company