From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Fujii Masao <fujii(at)postgresql(dot)org>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Weird failure with latches in curculio on v15
Date: 2023-02-19 14:36:24
Message-ID: CA+TgmoaLCxrdHPSnRLD1j1FQQx4k7QSJq-ybVOYj2aEF34sQLQ@mail.gmail.com
Lists: pgsql-hackers
On Sun, Feb 19, 2023 at 2:45 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> To me that seems even simpler? Nothing but the archiver is supposed to create
> .done files and nothing is supposed to remove .ready files without archiver
> having created the .done files. So the archiver process can scan
> archive_status until it's done or until N archives have been collected, and
> then process them at once? Only the creation of the .done files would be
> serial, but I don't think that's commonly a problem (and could be optimized as
> well, by creating multiple files and then fsyncing them in a second pass,
> avoiding N filesystem journal flushes).
>
> Maybe I am misunderstanding what you see as the problem?
Well right now the archiver process calls ArchiveFileCB when there's a
file ready for archiving, and that process is supposed to archive the
whole thing before returning. That pretty obviously seems to preclude
having more than one file being archived at the same time. What
callback structure do you have in mind to allow for that?
I mean, my idea was to basically just have one big callback:
ArchiverModuleMainLoopCB(). Which wouldn't return, or perhaps, would
only return when archiving was totally caught up and there was nothing
more to do right now. And then that callback could call functions like
AreThereAnyMoreFilesIShouldBeArchivingAndIfYesWhatIsTheNextOne(). So
it would call that function and it would find out about a file and
start an HTTP session or whatever and then call that function again
and start another HTTP session for the second file and so on until it
had as much concurrency as it wanted. And then when it hit the
concurrency limit, it would wait until at least one HTTP request
finished. At that point it would call
HeyEverybodyISuccessfullyArchivedAWalFile(), after which it could
again ask for the next file and start a request for that one and so on
and so forth.
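The loop described above could be sketched roughly as follows. This is a hypothetical illustration, not PostgreSQL's actual archive-module API: the function names are stand-ins for the ones named in this message, and the "HTTP requests" are simulated by a simple in-memory queue so the sketch is self-contained.

```c
#include <stdio.h>
#include <string.h>

#define MAX_CONCURRENCY 2

/* Simulated queue of .ready WAL segments the archiver would discover. */
static const char *pending[] = {
    "000000010000000000000001", "000000010000000000000002",
    "000000010000000000000003", "000000010000000000000004",
    "000000010000000000000005",
};
static int next_pending = 0;
static const int total_pending = 5;

/* Stand-in for AreThereAnyMoreFilesIShouldBeArchivingAndIfYesWhatIsTheNextOne():
 * returns the next file to archive, or NULL if none are ready. */
static const char *
next_file_to_archive(void)
{
    if (next_pending < total_pending)
        return pending[next_pending++];
    return NULL;
}

/* Simulated in-flight "HTTP requests". */
static const char *in_flight[MAX_CONCURRENCY];
static int n_in_flight = 0;

static void
start_archive_request(const char *walfile)
{
    in_flight[n_in_flight++] = walfile;
}

/* Block until one request finishes and return the finished file; the
 * module would then report it via HeyEverybodyISuccessfullyArchivedAWalFile(). */
static const char *
wait_for_one_completion(void)
{
    const char *done = in_flight[0];

    memmove(&in_flight[0], &in_flight[1],
            (size_t) (n_in_flight - 1) * sizeof(in_flight[0]));
    n_in_flight--;
    return done;
}

/* Stand-in for ArchiverModuleMainLoopCB(): returns only once archiving is
 * fully caught up.  Returns the number of files archived. */
int
archiver_module_main_loop(void)
{
    int archived = 0;

    for (;;)
    {
        const char *next;

        /* Start new requests until we hit the concurrency limit
         * or run out of files that are ready. */
        while (n_in_flight < MAX_CONCURRENCY &&
               (next = next_file_to_archive()) != NULL)
            start_archive_request(next);

        if (n_in_flight == 0)
            break;              /* caught up: nothing pending or in flight */

        /* Wait for at least one request to finish, then report success
         * (this is where the .done file would get created). */
        printf("archived %s\n", wait_for_one_completion());
        archived++;
    }
    return archived;
}
```

With the five simulated segments above, the loop keeps two requests in flight at a time and returns once all five have been reported archived.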
I don't really understand what the other possible model is here,
honestly. Right now, control remains within the archive module for the
entire time that a file is being archived. If we generalize the model
to allow multiple files to be in the process of being archived at the
same time, the archive module is going to need to have control as long
as at least one of them is in progress, at least AFAICS. If you have
some other idea how it would work, please explain it to me...
--
Robert Haas
EDB: http://www.enterprisedb.com