Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Christoph Berg <christoph(dot)berg(at)credativ(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andreas 'ads' Scherbaum <adsmail(at)wars-nicht(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PoC PATCH] Parallel dump to /dev/null
Date: 2018-03-21 08:07:41
Message-ID: 1521619661.15036.1.camel@credativ.de
Lists: pgsql-hackers

Hi,

On Tuesday, 2018-03-20 at 19:19 -0400, Stephen Frost wrote:
> * Christoph Berg (christoph(dot)berg(at)credativ(dot)de) wrote:
> > Re: Tom Lane 2018-03-20 <12960(dot)1521557852(at)sss(dot)pgh(dot)pa(dot)us>
> > > It might help if the patch were less enthusiastic about trying to
> > > "optimize" by avoiding extra file opens/closes in the case of output
> > > to /dev/null. That seems to account for a lot of the additional
> > > complication, and I can't see a reason that it'd be worth it.
> >
> > Note that the last patch was just a PoC to check if the extra
> > open/close could be avoided. The "real" patch is the 2nd last.
>
> Even so, I'm really not a fan of this patch either. If we could do this
> in a general way where we supported parallel mode with output to stdout
> or to a file and then that file could happen to be /dev/null, I'd be
> more interested because it's at least reasonable for someone to want
> that beyond using pg_dump to (poorly) check for corruption.

What you are saying is that you want parallel dump for the custom format,
not just the directory format. Sure, I want that too, but that is not what
this patch is about; it does not change the custom format in any way.

But I agree that this would be a useful feature.

> As it is, this is an extremely special case which may even end up being
> confusing for users (I can run a parallel pg_dump to /dev/null, but not
> to a regular file?!).

This patch allows the null device to be treated as a directory (which it
normally isn't) in the directory format. So it addresses the "I can
'pg_dump -Fc -f /dev/null', but not 'pg_dump -Fd -f /dev/null'?!"
question.

That it allows for a parallel dump to /dev/null is merely a side-effect
of this, so you could just as well name the patch "Support dumping to
/dev/null in directory format as well".
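
To illustrate the idea of treating the null device as a directory, here is a
minimal sketch. It is not the code from the patch: the helper names
is_null_device(), prepare_output_dir() and data_file_path() are hypothetical,
and the sketch assumes a POSIX system where the null device is "/dev/null".
When the target is the null device, no directory is created and every
per-table data file resolves to the null device itself.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: true if path names the null device (POSIX assumed). */
static bool
is_null_device(const char *path)
{
	return strcmp(path, "/dev/null") == 0;
}

/*
 * Hypothetical helper: prepare the output "directory" of a directory-format
 * dump. Returns 0 on success, -1 on failure (errno is set by mkdir).
 */
static int
prepare_output_dir(const char *path)
{
	if (is_null_device(path))
		return 0;			/* nothing to create, all output is discarded */
	return mkdir(path, 0700);
}

/*
 * Hypothetical helper: build the path of one data file inside the output
 * directory. For the null device, every file maps to the device itself.
 */
static void
data_file_path(const char *dir, const char *fname, char *buf, size_t len)
{
	if (is_null_device(dir))
		snprintf(buf, len, "%s", dir);
	else
		snprintf(buf, len, "%s/%s", dir, fname);
}
```

With these assumptions, each worker of a parallel dump would simply open
"/dev/null" for its data file, which is why the parallel case falls out as a
side effect.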

Before I wrote the patch, I honestly didn't know whether 'pg_dump -Fd -f
/dev/null' would simply work. It does not, and the error message you get
('could not create directory "/dev/null": File exists') is a bit meh
(though logical from pg_dump's point of view, of course).
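
The "File exists" part of that message is what mkdir() reports when the
target path already exists, whatever kind of file it is. A small sketch of
that behaviour, assuming a POSIX system with "/dev/null" present (try_mkdir()
is a hypothetical helper, not a pg_dump function):

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: try to create path as a directory.
 * Returns 0 on success, otherwise the errno value set by mkdir().
 */
static int
try_mkdir(const char *path)
{
	if (mkdir(path, 0700) == 0)
		return 0;
	return errno;
}
```

Calling try_mkdir("/dev/null") should return EEXIST, because the path already
exists as a character device.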

> Instead of trying to use pg_dump for this, we should provide a way to
> actually check for corruption across everything (instead of just the
> heap..), and have all detected corruption reported in one pass.

Well, I agree that we should provide a more general tool as well.

> That'll take another release to do, but hopefully pushing back on this
> will encourage that to happen, whereas allowing this in would actively
> discourage someone from writing a proper tool and we would be much
> worse off for that.

To be honest, I don't buy that argument. Having log shipping and warm
standbys did not stop us from eventually implementing proper streaming
replication. I would of course welcome the above tool, but I probably
won't work on it for v12.

I think what will happen is that everybody interested will just continue
to dump to /dev/null sequentially, or use external hacks such as RAM disks,
nullfs FUSE drivers, or scripts like
https://github.com/cybertec-postgresql/scripts/blob/master/quick_verify.py .

Michael

--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
