From: | Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> |
---|---|
To: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: block-level incremental backup |
Date: | 2019-07-11 11:30:22 |
Message-ID: | CAM2+6=X2A7FgYrKFxQCgEn+C2id21+HJGtyicby+R8LeNHFRvg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Anastasia,
On Wed, Jul 10, 2019 at 11:47 PM Anastasia Lubennikova <
a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> 23.04.2019 14:08, Anastasia Lubennikova wrote:
> > I'm volunteering to write a draft patch or, more likely, set of
> > patches, which
> > will allow us to discuss the subject in more detail.
> > And to do that I wish we agree on the API and data format (at least
> > broadly).
> > Looking forward to hearing your thoughts.
>
> Though the previous discussion stalled,
> I still hope that we could agree on basic points such as a map file
> format and protocol extension,
> which is necessary to start implementing the feature.
>
It's great that you too come up with the PoC patch. I didn't look at your
changes in much details but we at EnterpriseDB too working on this feature
and started implementing it.
Attached series of patches I had so far... (which needed further
optimization and adjustments though)
Here is the overall design (as proposed by Robert) we are trying to
implement:
1. Extend the BASE_BACKUP command that can be used with replication
connections. Add a new [ LSN 'lsn' ] option.
2. Extend pg_basebackup with a new --lsn=LSN option that causes it to send
the option added to the server in #1.
Here are the implementation details when we have a valid LSN
sendFile() in basebackup.c is the function which mostly does the thing for
us. If the filename looks like a relation file, then we'll need to consider
sending only a partial file. The way to do that is probably:
A. Read the whole file into memory.
B. Check the LSN of each block. Build a bitmap indicating which blocks have
an LSN greater than or equal to the threshold LSN.
C. If more than 90% of the bits in the bitmap are set, send the whole file
just as if this were a full backup. This 90% is a constant now; we might
make it a GUC later.
D. Otherwise, send a file with .partial added to the name. The .partial
file contains an indication of which blocks were changed at the beginning,
followed by the data blocks. It also includes a checksum/CRC.
Currently, a .partial file format looks like:
- start with a 4-byte magic number
- then store a 4-byte CRC covering the header
- then a 4-byte count of the number of blocks included in the file
- then the block numbers, each as a 4-byte quantity
- then the data blocks
We are also working on combining these incremental back-ups with the full
backup and for that, we are planning to add a new utility called
pg_combinebackup. Will post the details on that later once we have on the
same page for taking backup.
Thanks
--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
Attachment | Content-Type | Size |
---|---|---|
IncrBackup.tar.gz | application/gzip | 10.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2019-07-11 11:38:19 | Re: Index Skip Scan |
Previous Message | Sergei Kornilov | 2019-07-11 11:12:36 | Re: base backup client as auxiliary backend process |