Re: Patch: incorrect array offset in backend replication tar header

From: Brian Weaver <cmdrclueless(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: incorrect array offset in backend replication tar header
Date: 2012-09-25 13:38:22
Message-ID: CAAhXZGvax_EPVRA=_0oFKAx3DSfamwmspH8wUKsVzH3ASXfrdQ@mail.gmail.com
Lists: pgsql-hackers

Tom,

I actually plan on doing a lot of work on the frontend pg_basebackup
for my employer. pg_basebackup is 90% of the way to a solution that I
need for doing backups of *large* databases while allowing the
database to continue to work. The problem is a lack of secondary disk
space to store a replica of the original database cluster. I want
to modify pg_basebackup to include the WAL files in the tar output. I
have several ideas but I need to code and test them. That was the main
reason I was examining the backend code.

If you're willing to wait a bit for me to code and test my extensions
to pg_basebackup, I will try to address some of the deficiencies as
well as add new features.

I agree the checksum algorithm could definitely use some refactoring.
I was already working on that before I retired last night.
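
For context, here is a minimal sketch of how a ustar header checksum is
conventionally computed and stored. It is not the pg_dump or backend
code; the function names and constants are illustrative assumptions. The
checksum is the sum of all 512 header bytes taken as unsigned values,
with the 8-byte checksum field (offset 148) counted as ASCII spaces.

```c
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK_SIZE    512
#define TAR_CHKSUM_OFFSET 148   /* checksum field offset in a ustar header */
#define TAR_CHKSUM_LEN    8

/*
 * Hypothetical helper: sum every header byte as an unsigned value,
 * treating the checksum field itself as eight ASCII spaces.
 */
static unsigned int
tar_header_checksum(const unsigned char *header)
{
    unsigned int sum = 0;
    int          i;

    for (i = 0; i < TAR_BLOCK_SIZE; i++)
    {
        if (i >= TAR_CHKSUM_OFFSET && i < TAR_CHKSUM_OFFSET + TAR_CHKSUM_LEN)
            sum += ' ';
        else
            sum += header[i];
    }
    return sum;
}

/*
 * Hypothetical helper: store the checksum using the common layout of
 * six octal digits, a NUL, and a trailing space.
 */
static void
tar_write_checksum(unsigned char *header)
{
    char         buf[TAR_CHKSUM_LEN];
    unsigned int sum = tar_header_checksum(header);

    /* The maximum possible sum (512 * 255) fits in six octal digits. */
    snprintf(buf, sizeof(buf), "%06o", sum);
    memcpy(header + TAR_CHKSUM_OFFSET, buf, 7);     /* digits plus NUL */
    header[TAR_CHKSUM_OFFSET + 7] = ' ';
}
```

Because the field is counted as spaces while summing, the result does
not depend on whatever bytes the field held beforehand.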

-- Brian

On Mon, Sep 24, 2012 at 10:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Brian Weaver <cmdrclueless(at)gmail(dot)com> writes:
>> Here are lines 321 through 329 of 'archive_read_support_format_tar.c'
>> from libarchive
>
>> 321   /* Recognize POSIX formats. */
>> 322   if ((memcmp(header->magic, "ustar\0", 6) == 0)
>> 323       && (memcmp(header->version, "00", 2) == 0))
>> 324           bid += 56;
>> 325
>> 326   /* Recognize GNU tar format. */
>> 327   if ((memcmp(header->magic, "ustar ", 6) == 0)
>> 328       && (memcmp(header->version, " \0", 2) == 0))
>> 329           bid += 56;
>
>> I'm wondering if the original committer put the 'ustar00\0' string in by design?
>
> The second part of that looks to me like it matches "ustar  \0",
> not "ustar00\0". I think the pg_dump coding is just wrong. I've
> already noticed that its code for writing the checksum is pretty
> brain-dead too :-(
>
> Note that according to the wikipedia page, tar programs typically
> accept files as pre-POSIX format if the checksum is okay, regardless of
> what is in the magic field; and the fields that were added by POSIX
> are noncritical so we'd likely never notice that they were being
> ignored. (In fact, looking closer, pg_dump isn't even filling those
> fields anyway, so the fact that it's not producing a compliant magic
> field may be a good thing ...)
>
> regards, tom lane
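
To make the magic-field discussion above concrete, here is a small
sketch that mirrors the two libarchive comparisons quoted earlier. The
enum and function name are hypothetical, and the offsets (magic at 257,
version at 263) are the standard ustar header positions rather than
anything taken from the PostgreSQL sources.

```c
#include <string.h>

#define TAR_MAGIC_OFFSET   257  /* 6-byte magic field */
#define TAR_VERSION_OFFSET 263  /* 2-byte version field */

typedef enum
{
    TAR_FLAVOR_UNKNOWN,
    TAR_FLAVOR_POSIX,
    TAR_FLAVOR_GNU
} tar_flavor;

/*
 * Hypothetical helper: classify a 512-byte header by its magic and
 * version fields, using the same byte patterns as the quoted checks.
 */
static tar_flavor
tar_classify_magic(const unsigned char *header)
{
    const unsigned char *magic = header + TAR_MAGIC_OFFSET;
    const unsigned char *version = header + TAR_VERSION_OFFSET;

    /* POSIX ustar: "ustar" plus NUL, then version "00" */
    if (memcmp(magic, "ustar\0", 6) == 0 && memcmp(version, "00", 2) == 0)
        return TAR_FLAVOR_POSIX;

    /* GNU tar: "ustar" plus space, then version space plus NUL */
    if (memcmp(magic, "ustar ", 6) == 0 && memcmp(version, " \0", 2) == 0)
        return TAR_FLAVOR_GNU;

    return TAR_FLAVOR_UNKNOWN;
}
```

With these checks, a header carrying the eight bytes "ustar00\0" at
offset 257 matches neither pattern, since its magic field reads
"ustar0" and its version field reads "0\0".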

--

/* insert witty comment here */
