Re: Sample archive_command is still problematic

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Kevin Grittner" <kgrittn(at)ymail(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-docs(at)postgresql(dot)org>
Subject: Re: Sample archive_command is still problematic
Date: 2014-08-14 04:31:55
Message-ID: 35326E59461948B394A861F69C272795@maumau
Lists: pgsql-docs

From: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
> I realize that there are about 128 different ways people set this up
> (which is itself a problem), but it appears to me that a solution like
> pg_copy only provides local copying, which implies the use of something
> like NFS. Which may be OK, but then we'd need to get into the details
> of how to set up NFS properly for this.

Yes, I think the flexibility of archive_command is nice. The problem I want
to address is that users don't have a simple way to reliably archive files
in very simple use cases -- local copying to local or network storage.
pg_copy is a low-level command to fill the gap.

> Also, I think you can get local copy+fsync with dd.

Yes, dd on Linux has a "sync" option, but dd on Solaris doesn't. I can't
find an equivalent command that is installed by default on Windows.
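
To illustrate what a portable copy+fsync involves, here is a minimal Python
sketch. It is not pg_copy itself, only an assumption about the steps such a
tool would need; the names archive_copy and main are made up for this example.
It copies to a temporary name, fsyncs the data, renames the file into place,
and fsyncs the directory where the platform allows it.

```python
import os
import shutil
import sys


def archive_copy(src, dst):
    """Copy src to dst and flush it to stable storage (illustrative sketch)."""
    # Never overwrite an already archived file. Note: this check followed by
    # the rename below is not atomic; a real tool would need to close that gap.
    if os.path.exists(dst):
        raise FileExistsError(dst)

    tmp = dst + ".tmp"  # hypothetical temporary name in the target directory
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's buffer to the OS
        os.fsync(fout.fileno())  # ask the OS to write the data to disk

    # Make the file visible under its final name only once it is complete.
    os.rename(tmp, dst)

    # Persist the directory entry too. Opening a directory for fsync works on
    # POSIX systems; it is skipped elsewhere (for example on Windows).
    if os.name == "posix":
        dirfd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
        try:
            os.fsync(dirfd)
        finally:
            os.close(dirfd)


def main(argv):
    """Return 0 on success and nonzero on failure, as archive_command expects."""
    if len(argv) != 3:
        print("usage: archive_copy.py SOURCE DEST", file=sys.stderr)
        return 2
    try:
        archive_copy(argv[1], argv[2])
    except OSError as exc:
        print("archive_copy failed: %s" % exc, file=sys.stderr)
        return 1
    return 0
```

If this were saved as a script that ends with sys.exit(main(sys.argv)), it
could be called from archive_command with %p and %f as the two arguments. The
script name and its location are hypothetical.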

> The alternatives of doing remote copying inside archive_command are also
> questionable if you have multiple standbys.

Yes, we may need an interface other than archive_command for archiving files
to multiple locations. That's another issue.

> Basically, this whole interface is terrible. Maybe it's time to phase
> it out and start looking into pg_receivexlog.

pg_receivexlog seems difficult to me. Users have to start, stop, and
monitor pg_receivexlog. That's burdensome. For example, how do we start
pg_receivexlog easily on Windows when PostgreSQL is configured as a
Windows service to start/stop automatically on OS startup/shutdown? In
addition, users have to be aware of connection slots (max_connections and
max_wal_senders) and replication slots.

pg_receivexlog imposes extra overhead even on simple use cases. I want
backup-related facilities to use as few resources as possible. For example,
with archive_command, the data flows like this:

disk -> OS cache -> copy command's buffer -> OS cache -> disk

OTOH, with pg_receivexlog:

disk -> OS cache -> walsender's buffer -> socket send buffer -> kernel
buffer? -> socket receive buffer -> pg_receivexlog's buffer -> OS cache ->
disk

For reference, psql's \copy is described like this in the documentation:

Tip: This operation is not as efficient as the SQL COPY command because all
data must pass through the client/server connection. For large amounts of
data the SQL command might be preferable.

Regards
MauMau
