From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Disable WAL logging to speed up data loading |
Date: | 2020-11-09 18:20:40 |
Message-ID: | CAKFQuwbunE8QONrbvWsV7Vk33bOccyPkOeG+-7_CkeA9g3GOSQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 9, 2020 at 10:36 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * David G. Johnston (david(dot)g(dot)johnston(at)gmail(dot)com) wrote:
>
> > If the commit doesn't complete all of the newly created pages are junk.
> > Otherwise, you have a crash-recoverable state for those tables as regards
> > those specific pages.
>
> How would we track that and know which pages are junk?
>
Every one of those pages would have a single dead transaction id contained
within it. If there is bookkeeping that needs to happen that could be wal
logged - the goal here being not to avoid all wal but to avoid data wal. I
don't know enough about the internals here to be more specific.
> > Conceptually, we need an ability to perform a partial CHECKPOINT that
> names
> > specific tables, and make sure the crash-recovery works for those tables
> > while figuring out what amount of effort to expend on informing the dba
> and
> > alerting/preventing features that require wal from using those tables.
>
> Yeah, seems pretty complicated.
>
> Did you see an issue with the basic idea I proposed earlier, whereby an
> unlogged table could become 'logged', while we are at wal_level=minimal,
> by essentially checkpointing it (locking it, forcing out any buffers we
> have associated with it, and then fsync'ing it- not sure how much of
> that is already done in the unlogged->logged process but I would guess
> most of it) while not actually writing it into the WAL?
>
That is basically half of what is described above - the part at commit when
the relation is persisted to disk. What your earlier description seems to
be missing is the part about temporarily making a logged relation
unlogged. I envision that as being part of a transaction as opposed to a
permanent attribute of the table. I envision a storage parameter that
allows individual relations to be considered as having wal_level='minimal'
even if the system as a whole has, e.g., wal_level='replication'. Only
those could be forced into this temporarily unlogged mode.
One part I hadn't given thought to is indexes and how those would interact
with this plan. Mostly due to lack of internals knowledge.
David J.
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2020-11-09 18:47:39 | Re: RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare() |
Previous Message | Tomas Vondra | 2020-11-09 18:19:59 | Re: automatic analyze: readahead - add "IO read time" log message |