From: Dmitry Shulga <d(dot)shulga(at)postgrespro(dot)ru>
To: sfrost(at)snowman(dot)net
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reduce the time required for a database recovery from archive.
Date: 2020-10-19 07:58:15
Message-ID: 54014787-2A2A-43C1-BB66-5B21B2A05FDE@postgrespro.ru
Lists: pgsql-hackers
Hello Stephen,
> On 9 Sep 2020, at 21:26, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Greetings,
>
> * Dmitry Shulga (d(dot)shulga(at)postgrespro(dot)ru) wrote:
>> Overall archive file processing is done one by one, and this might
>> create a performance bottleneck if archived WAL files are delivered slowly,
>> because the database server has to wait for arrival of the next
>> WAL segment before applying its records.
>>
>> To address this issue it is proposed to receive archived WAL files in parallel
>> so that when the next WAL segment file is required for processing of redo log
>> records it would be already available.
>
> Yes, pgbackrest already does exactly this (if configured)- uses parallel
> processes to fetch the WAL and have it be available ahead of time.
pgbackrest is third-party software that has to be installed separately on the customer's premises.
Built-in support for this optimization, on the other hand, is a good argument for adding
the feature to PostgreSQL itself and providing it to customers out of the box.
>
>> Implementation of this approach assumes running several background processes (bgworkers)
>> each of which runs a shell command specified by the parameter restore_command
>> to deliver an archived WAL file. Number of running parallel processes is limited
>> by the new parameter max_restore_command_workers. If this parameter has value 0
>> then WAL files delivery is performed using the original algorithm, that is in
>> one-by-one manner. If this parameter has value greater than 0 then the database
>> server starts several bgworker processes up to the limit specified by
>> the parameter max_restore_command_workers and passes to every process
>> WAL file name to deliver. Active processes start prefetching of specified
>> WAL files and store received files in the directory pg_wal/pgsql_tmp. After
>> bgworker process finishes receiving a file it marks itself as a free process
>> and waits for a new request to receive a next WAL file. The main process
>> performing database recovery still handles WAL files in one-by-one manner,
>> but instead of waiting for a next required WAL file's availability it checks for
>> that file in the prefetched directory. If a new file is present there,
>> the main process starts its processing.
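To make that concrete: with the patch applied, enabling the optimization amounts to something like the following settings (the archive path and worker count below are example values only, not recommendations):

    # postgresql.conf excerpt
    restore_command = 'cp /mnt/archive/%f %p'   # shell command run by each bgworker
    max_restore_command_workers = 4             # 0 = original one-by-one delivery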
>
> I'm a bit confused about this description- surely it makes sense for the
> parallel workers to continue to loop and fetch up to some specified
> max..? Then to monitor and to fetch more when the amount pre-fetched so
> far drops before that level? The description above makes it sound like
> X WAL will be fetched ahead of time, and then the recovery process will
> go through those until it runs out and then it'll have to wait for the
> next X WAL to be fetched, which means it's still going to end up being
> delayed even with these parallel processes, which isn't good.

OK. The description I originally provided was probably misleading, so I will try to clarify it.
As soon as a bgworker process finishes delivering a WAL file, it marks itself as free.
The applier of WAL records runs in parallel with the bgworkers and processes the WAL files sequentially.
Once it finishes handling the current WAL file, it checks whether extra bgworker processes can be started
to deliver the WAL files that will be required a bit later. If free bgworker processes are available,
the applier requests them to start downloading one or more extra WAL files. After that, the applier
determines the name of the next WAL file to handle and checks whether it exists in the prefetching
directory. If it does, the applier starts handling it, and the processing loop repeats. So the workers
do keep looping: prefetching is topped up every time the applier moves on to the next file, rather than
being done in fixed batches of X files.
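To illustrate the intended control flow, here is a minimal standalone C sketch of the scheme. It is not the patch code; the archive path, segment names and fetch command are hypothetical placeholders. Worker slots fetch segments with a restore_command-style shell command while the applier consumes them strictly in order from the prefetch directory:

    /* Toy model of the prefetch scheme: up to MAX_WORKERS child
     * processes each fetch one WAL segment into PREFETCH_DIR while
     * the parent "applies" segments strictly in order. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define MAX_WORKERS  4                  /* cf. max_restore_command_workers */
    #define PREFETCH_DIR "pg_wal/pgsql_tmp"

    static pid_t workers[MAX_WORKERS];      /* 0 means the slot is free */

    /* Mark workers whose fetch has completed as free again. */
    static void reap_finished(void)
    {
        for (int i = 0; i < MAX_WORKERS; i++)
            if (workers[i] > 0 && waitpid(workers[i], NULL, WNOHANG) > 0)
                workers[i] = 0;
    }

    /* Hand one segment to a free worker slot; false if all are busy. */
    static bool start_prefetch(const char *segname)
    {
        for (int i = 0; i < MAX_WORKERS; i++)
        {
            if (workers[i] == 0)
            {
                pid_t pid = fork();
                if (pid == 0)
                {
                    /* Child: run a restore_command-style shell command. */
                    char cmd[1024];
                    snprintf(cmd, sizeof(cmd), "cp /mnt/archive/%s %s/%s",
                             segname, PREFETCH_DIR, segname);
                    _exit(system(cmd) == 0 ? 0 : 1);
                }
                workers[i] = pid;
                return true;
            }
        }
        return false;
    }

    int main(void)
    {
        /* Ordered list of segments the applier will need (toy data). */
        const char *segs[] = {"000000010000000000000001",
                              "000000010000000000000002",
                              "000000010000000000000003"};
        int nsegs = 3, next_to_apply = 0, next_to_fetch = 0;

        while (next_to_apply < nsegs)
        {
            reap_finished();

            /* Keep free workers busy fetching segments needed later. */
            while (next_to_fetch < nsegs && start_prefetch(segs[next_to_fetch]))
                next_to_fetch++;

            /* Apply the next segment once it appears in PREFETCH_DIR. */
            char path[1024];
            snprintf(path, sizeof(path), "%s/%s",
                     PREFETCH_DIR, segs[next_to_apply]);
            if (access(path, R_OK) == 0)
            {
                printf("applying %s\n", path);  /* redo would happen here */
                next_to_apply++;
            }
            else
                usleep(100000);                 /* wait for delivery */
        }
        return 0;
    }

The point of the sketch is that the applier never blocks on the fetch itself: as long as the next segment has already been delivered, redo keeps going while the remaining workers fill the prefetch directory.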
>
> Does this also properly handle timeline switches..?
>
> Thanks,
>
> Stephen
Regards,
Dmitry