From: | David Zhang <david(dot)zhang(at)highgo(dot)ca> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com> |
Cc: | Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, Kashif Zeeshan <kashif(dot)zeeshan(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP/PoC for parallel backup |
Date: | 2020-04-27 16:53:16 |
Message-ID: | 0bad90f5-b2c3-6f62-57da-f88b003f8570@highgo.ca |
Lists: | pgsql-hackers |
Hi,
Here are the parallel backup performance test results with and without
the patch "parallel_backup_v15" in an AWS cloud environment. Two
"t2.xlarge" machines were used: one for the Postgres server and the other
for pg_basebackup, both with the machine configuration shown below.
Machine configuration:
Instance Type :t2.xlarge
Volume type :io1
Memory :16GB
vCPU # :4
Architecture :x86_64
IOP :6000
Database Size (GB) :108
Performance test results:
without patch:
real 18m49.346s
user 1m24.178s
sys 7m2.966s
1 worker with patch:
real 18m43.201s
user 1m55.787s
sys 7m24.724s
2 workers with patch:
real 18m47.373s
user 2m22.970s
sys 11m23.891s
4 workers with patch:
real 18m46.878s
user 2m26.791s
sys 13m14.716s
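To make the comparison easier to read, here is a minimal sketch that converts the timings above to seconds and reports each run relative to the unpatched one. The helper names (to_seconds, summarize) and the RUNS table are illustrative only and not part of the patch or the test setup; the numbers are copied from the results above.

```python
import re

# Timings copied verbatim from the results above: (real, user, sys).
RUNS = {
    "without patch": ("18m49.346s", "1m24.178s", "7m2.966s"),
    "1 worker": ("18m43.201s", "1m55.787s", "7m24.724s"),
    "2 workers": ("18m47.373s", "2m22.970s", "11m23.891s"),
    "4 workers": ("18m46.878s", "2m26.791s", "13m14.716s"),
}


def to_seconds(stamp):
    """Convert a `time`-style value such as '18m49.346s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(runs, baseline="without patch"):
    """Return wall-clock and CPU (user + sys) time per run, plus ratios
    relative to the baseline run."""
    base_real, base_user, base_sys = (to_seconds(t) for t in runs[baseline])
    base_cpu = base_user + base_sys
    rows = {}
    for name, stamps in runs.items():
        real, user, sys_time = (to_seconds(t) for t in stamps)
        cpu = user + sys_time
        rows[name] = {
            "real_s": real,
            "cpu_s": cpu,
            "real_ratio": real / base_real,
            "cpu_ratio": cpu / base_cpu,
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(RUNS).items():
        print(
            f"{name:>14}: real {row['real_s']:8.1f}s "
            f"(x{row['real_ratio']:.3f}), "
            f"user+sys {row['cpu_s']:7.1f}s (x{row['cpu_ratio']:.3f})"
        )
```

With these numbers, the wall-clock time of every run stays within 1% of the unpatched run, while user+sys time grows to roughly 1.86 times the unpatched value with 4 workers.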
As requested, I did not run pgbench in parallel this time, unlike in
the previous benchmark.
The perf report files for both Postgres server and pg_basebackup sides
are attached.
The files are listed below: one report without the patch, and one each
with the patch using 1, 2, and 4 workers.
perf report on Postgres server side:
perf.data-postgres-without-parallel_backup_v15.txt
perf.data-postgres-with-parallel_backup_v15-j1.txt
perf.data-postgres-with-parallel_backup_v15-j2.txt
perf.data-postgres-with-parallel_backup_v15-j4.txt
perf report on pg_basebackup side:
perf.data-pg_basebackup-without-parallel_backup_v15.txt
perf.data-pg_basebackup-with-parallel_backup_v15-j1.txt
perf.data-pg_basebackup-with-parallel_backup_v15-j2.txt
perf.data-pg_basebackup-with-parallel_backup_v15-j4.txt
If any more information is required, please let me know.
On 2020-04-21 7:12 a.m., Amit Kapila wrote:
> On Tue, Apr 21, 2020 at 5:26 PM Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com> wrote:
>> On Tue, Apr 21, 2020 at 4:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>>> On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> wrote:
>>>>> I did some tests a while back, and here are the results. The tests were done to simulate
>>>>> a live database environment using pgbench.
>>>>>
>>>>> machine configuration used for this test:
>>>>> Instance Type: t2.xlarge
>>>>> Volume Type : io1
>>>>> Memory (MiB) : 16384
>>>>> vCPU # : 4
>>>>> Architecture : X86_64
>>>>> IOP : 16000
>>>>> Database Size (GB) : 102
>>>>>
>>>>> The setup consist of 3 machines.
>>>>> - one for database instances
>>>>> - one for pg_basebackup client and
>>>>> - one for pgbench with some parallel workers, simulating SELECT loads.
>>>>>
>>>>> basebackup | 4 workers | 8 Workers | 16 workers
>>>>> Backup Duration(Min): 69.25 | 20.44 | 19.86 | 20.15
>>>>> (pgbench running with 50 parallel client simulating SELECT load)
>>>>>
>>>>> Backup Duration(Min): 154.75 | 49.28 | 45.27 | 20.35
>>>>> (pgbench running with 100 parallel client simulating SELECT load)
>>>>>
>>>> Thanks for sharing the results, these show nice speedup! However, I
>>>> think we should try to find what exactly causes this speed up. If you
>>>> see the recent discussion on another thread related to this topic,
>>>> Andres, pointed out that he doesn't think that we can gain much by
>>>> having multiple connections[1]. It might be due to some internal
>>>> limitations (like small buffers) [2] due to which we are seeing these
>>>> speedups. It might help if you can share the perf reports of the
>>>> server-side and pg_basebackup side.
>>>>
>>> Just to be clear, we need perf reports both with and without patch-set.
>>
>> These tests were done a while back, I think it would be good to run the benchmark again with the latest patches of parallel backup and share the results and perf reports.
>>
> Sounds good. I think we should also try to run the test with 1 worker
> as well. The reason it will be good to see the results with 1 worker
> is that we can know if the technique to send file by file as is done
> in this patch is better or worse than the current HEAD code. So, it
> will be good to see the results of an unpatched code, 1 worker, 2
> workers, 4 workers, etc.
>
--
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
Attachment | Content-Type | Size |
---|---|---|
perf-report-parallel_backup_v15.zip | application/zip | 24.4 KB |