Re: Warn when parallel restoring a custom dump without data offsets

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: David Gilman <davidgilman1(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Warn when parallel restoring a custom dump without data offsets
Date: 2020-05-20 03:26:57
Message-ID: CAOaQA5xsa7KTsX3js5mt00Do+AiarQb5qd4ZYM3AR_NwXtFzeg@mail.gmail.com
Lists: pgsql-hackers

I started fooling with this at home while our ISP is down (pardon my brevity).

Maybe you also saw commit b779ea8a9a2dc3a089b3ac152b1ec4568bfeb26f
"Fix pg_restore so parallel restore doesn't fail when the input file
doesn't contain data offsets (which it won't, if pg_dump thought its
output wasn't seekable)..."

...which I guess should actually say "doesn't NECESSARILY fail", since
it also adds this comment:
"This could fail if we are asked to restore items out-of-order."

So this is a known issue and not a regression. I think the PG11
commit you mentioned (548e5097) happens to make some databases fail in
parallel restore that previously worked (I didn't check). Conversely,
some databases (or some pre-existing dumps) which used to fail might
now succeed.

Your patch adds a warning when a dump written to unseekable output (and
so lacking data offsets) might fail during parallel restore. I'm not
opposed to that, but can we just make pg_restore work in that case? If
the input is unseekable, then we can never do a parallel restore at all.
If it *is* seekable, could we make _PrintTocData rewind when it gets to
EOF, using fseeko(fh, 0, SEEK_SET), and re-scan from the beginning?
Would you want to try that?
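
To make the rewind idea concrete, here is a rough standalone sketch. It is
not pg_backup_custom.c code: the BlockHeader layout and the names
write_block, scan_forward and find_block are made up for illustration, and
it assumes the input is a seekable stdio stream. It scans forward for the
wanted block, and on EOF it seeks back to offset 0 and re-scans only up to
the position where the search started.

```c
#define _POSIX_C_SOURCE 200809L   /* for fseeko/ftello/off_t */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical block header; NOT the real custom-format layout. */
typedef struct
{
    int32_t id;    /* identifies the block (stand-in for a dump ID) */
    int32_t len;   /* payload length in bytes */
} BlockHeader;

/* Append one block (header + payload). Returns 0 on success, -1 on error. */
static int
write_block(FILE *fp, int32_t id, const void *data, int32_t len)
{
    BlockHeader h = {id, len};

    if (fwrite(&h, sizeof h, 1, fp) != 1)
        return -1;
    if (len > 0 && fwrite(data, 1, (size_t) len, fp) != (size_t) len)
        return -1;
    return 0;
}

/*
 * Scan forward from the current position, which must be a block boundary.
 * Stops at EOF, or at byte offset "limit" if limit >= 0.
 * Returns 1 if found (stream is left at the start of the payload),
 * 0 if not found, -1 on I/O error.
 */
static int
scan_forward(FILE *fp, int32_t wanted_id, off_t limit, int32_t *len_out)
{
    BlockHeader h;

    for (;;)
    {
        off_t pos = ftello(fp);

        if (pos < 0)
            return -1;
        if (limit >= 0 && pos >= limit)
            return 0;
        if (fread(&h, sizeof h, 1, fp) != 1)
            return ferror(fp) ? -1 : 0;   /* EOF: not found in this pass */
        if (h.id == wanted_id)
        {
            *len_out = h.len;
            return 1;
        }
        /* Not the block we want: skip over its payload. */
        if (fseeko(fp, (off_t) h.len, SEEK_CUR) != 0)
            return -1;
    }
}

/*
 * Find a block even if it lies before the current position: scan to EOF
 * first, then rewind and re-scan the part of the file we had skipped.
 */
static int
find_block(FILE *fp, int32_t wanted_id, int32_t *len_out)
{
    off_t start;
    int   rc;

    assert(fp != NULL && len_out != NULL);

    start = ftello(fp);
    if (start < 0)
        return -1;

    rc = scan_forward(fp, wanted_id, -1, len_out);
    if (rc != 0)
        return rc;

    /* Hit EOF without finding it: rewind (this also clears the EOF flag). */
    if (fseeko(fp, 0, SEEK_SET) != 0)
        return -1;
    return scan_forward(fp, wanted_id, start, len_out);
}
```

The caller is expected to consume the payload after a successful
find_block, so the next search starts on a block boundary. Bounding the
second pass by the starting offset means each search reads the file at most
once, and a missing block is reported instead of looping forever.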
