From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> |
Cc: | dipesh(dot)pandit(at)gmail(dot)com, Kashif Zeeshan <kashif(dot)zeeshan(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP/PoC for parallel backup |
Date: | 2020-04-22 16:27:35 |
Message-ID: | CA+TgmobuV502WTiHbyefjGNfjBmujF_8uU_M-rukvp7g-wEt9Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Apr 22, 2020 at 10:18 AM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> wrote:
> I don't foresee memory being a challenge here. Assuming a database containing 10240
> relation files (which can reach up to 10 TB in size), the list will occupy approximately 102MB
> of space in memory. This obviously can be reduced, but it doesn’t seem too bad either.
> One way of doing it is to fetch a smaller set of files, and clients can request the next
> set once the current one is processed; perhaps fetch initially per tablespace and request the
> next one once the current one is done.
The more concerning case is when someone has a lot of small files.
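To put rough numbers on that: the quoted estimate implies roughly 10,000 bytes per list entry (10240 entries for about 102MB), and the cost scales with the number of files, not with the amount of data. A minimal sketch of the arithmetic, assuming that per-entry figure holds regardless of file size; the constant and the function name are illustrative only, not taken from the patch:

```c
#include <stdint.h>

/*
 * ASSUMPTION: about 10,000 bytes per file-list entry, inferred from the
 * quoted estimate (10240 entries -> roughly 102MB). Not a measured value.
 */
#define ASSUMED_BYTES_PER_ENTRY UINT64_C(10000)

/* Estimated memory, in bytes, needed to hold a list of n_files entries. */
static uint64_t
file_list_bytes(uint64_t n_files)
{
    return n_files * ASSUMED_BYTES_PER_ENTRY;
}
```

Under that assumption, file_list_bytes(10240) gives 102,400,000 bytes, matching the quoted figure, while a cluster with a million small files gives file_list_bytes(1000000) = 10,000,000,000 bytes, about 10 GB for the list alone.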
> Okay, I have added throttling_counter as an atomic. However, a lock is still required
> for throttling_counter%=throttling_sample.
Well, if you can't get rid of the lock, using atomics is pointless.
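For illustration, an add followed by a modulo can usually be folded into one lock-free update with a compare-and-swap loop. The following is only a sketch in standard C11 atomics, not the patch's code: throttle_account() is a hypothetical name, the sample size of 1000 is arbitrary, and the sleep logic that real throttling needs is left out. PostgreSQL code would presumably use its own pg_atomic_* wrappers rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the throttling state discussed above. */
static _Atomic uint64_t throttling_counter = 0;
static const uint64_t throttling_sample = 1000;

/*
 * Add 'increment' to the counter. If the total reaches the sample size,
 * store total % throttling_sample instead. Returns true when a sample
 * boundary was crossed. The add and the modulo are published together
 * by a single compare-and-swap, so no lock is needed.
 */
static bool
throttle_account(uint64_t increment)
{
    uint64_t old = atomic_load(&throttling_counter);
    uint64_t sum;
    uint64_t next;

    do
    {
        sum = old + increment;
        next = (sum >= throttling_sample) ? sum % throttling_sample : sum;
        /* On failure, 'old' is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(&throttling_counter, &old, next));

    return sum >= throttling_sample;
}
```

Whether this fits depends on what else the lock protects; if other shared throttling state has to change together with the counter, a single CAS on the counter is not enough.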
>> + sendFile(file, file + basepathlen, &statbuf,
>> true, InvalidOid, NULL, NULL);
>>
>> Maybe I'm misunderstanding, but this looks like it's going to write a
>> tar header, even though we're not writing a tarfile.
>
> sendFile() always sends files with tar header included, even if the backup mode
> is plain. pg_basebackup also expects the same. That's the current behavior of
> the system.
>
> Otherwise, we will have to duplicate this function, which would do pretty much
> the same thing, except for the tar header.
Well, as I said before, the solution to that problem is refactoring,
not crummy interfaces. You're never going to persuade any committer
who understands what that code actually does to commit it.
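As a rough picture of the kind of refactoring meant here: split the header-writing step from the content-sending step, so a caller that is not producing a tarfile can skip the header without duplicating the rest. This is a toy sketch only; all three function names are hypothetical, the "header" is a placeholder string and not a real 512-byte tar header, tar padding is omitted, and none of this mirrors the actual sendFile() signature.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical: writes a placeholder header and returns bytes written. */
static size_t
send_tar_header(FILE *out, const char *name, size_t size)
{
    int n = fprintf(out, "[hdr %s %zu]", name, size);

    return (n < 0) ? 0 : (size_t) n;
}

/* Hypothetical: sends only the file contents, with no archive framing. */
static size_t
send_file_contents(FILE *out, const char *data, size_t size)
{
    return fwrite(data, 1, size, out);
}

/* Tar-format callers compose the two steps; other callers use only the second. */
static size_t
send_file_as_tar_member(FILE *out, const char *name,
                        const char *data, size_t size)
{
    size_t n = send_tar_header(out, name, size);

    return n + send_file_contents(out, data, size);
}
```

With a split like this, the non-tar path calls send_file_contents() directly and never emits a header that the receiver then has to strip.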
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company