Re: Adding Support for Copy callback functionality on COPY TO api

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, "Sanaba, Bilva" <bilvas(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding Support for Copy callback functionality on COPY TO api
Date: 2022-10-11 00:01:41
Message-ID: Y0SyZe081FFoHazf@paquier.xyz
Lists: pgsql-hackers

On Mon, Oct 10, 2022 at 09:38:59AM -0700, Nathan Bossart wrote:
> This new callback allows the use of COPY TO's machinery in extensions. A
> couple of generic use-cases are listed upthread [0], and one concrete
> use-case is the aws_s3 extension [1].

FWIW, I understand that the proposal is to have easier control over
how, what and where the data is processed. COPY TO PROGRAM
provides that with exactly the same kind of interface (data input, its
length) once you have a program able to process the data piped out the
same way. However, it is in the shape of an external process that
receives the data through a pipe, hence it provides a much wider attack
surface, which is something that all cloud providers care about. The
thing is that the callback allows extension developers to avoid running
arbitrary commands on the backend as the OS user running the Postgres
instance, while still being able to process the data the way they want
(auditing, analytics, whatever) within the strict context of the
process running the extension code. I'd say that this is a very cheap
change to allow people to have more fun with the backend engine
(similar to the recent changes with archive libraries for
archive_command, but much less complex):
 src/backend/commands/copy.c   |  2 +-
 src/backend/commands/copyto.c | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 4 deletions(-)

(Not to mention that we've had our share of CVEs regarding COPY
PROGRAM even if it is superuser-only).
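
As a rough illustration of the interface shape described above (a data
buffer plus its length, handed to code running inside the backend
process rather than to an external program), here is a minimal
standalone sketch. It does not use any PostgreSQL headers: the type
copy_dest_cb_sketch, the function mock_copy_to() and the counters are
hypothetical names made up for this example, and the actual names and
signatures in the patch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the callback type under discussion: the
 * callback receives a pointer to one chunk of output and its length.
 */
typedef void (*copy_dest_cb_sketch) (void *data, int len);

/* Toy state an extension might keep (auditing, analytics, whatever). */
static long total_bytes = 0;
static int	total_rows = 0;

/* Example callback: count the rows and bytes it is handed. */
static void
count_rows_cb(void *data, int len)
{
	assert(data != NULL && len >= 0);
	total_bytes += len;
	total_rows++;
}

/*
 * Mock of the COPY TO side (an assumption, not backend code): pass each
 * already-formatted row to the callback instead of a file or a pipe.
 */
static void
mock_copy_to(const char *const *rows, int nrows, copy_dest_cb_sketch cb)
{
	for (int i = 0; i < nrows; i++)
		cb((void *) rows[i], (int) strlen(rows[i]));
}

/* Drive the mock with three text-format rows; returns the row count. */
static int
demo(void)
{
	static const char *const rows[] = {"1\tfoo\n", "2\tbar\n", "3\tbaz\n"};

	total_bytes = 0;
	total_rows = 0;
	mock_copy_to(rows, 3, count_rows_cb);
	return total_rows;
}
```

The point of the sketch is only that the consumer of the data is an
ordinary function in the same process, so no command has to be spawned
as the OS user running the instance.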

> I really doubt that this small test case is going to cause anything
> approaching undue maintenance burden. I think it's important to ensure
> this functionality continues to work as expected long into the future.

I like these toy modules, they provide test coverage while acting as a
template for new developers. I am wondering whether it should have
something for the COPY FROM callback, actually, as it is named
"test_copy_callbacks", but I see no need to extend the module more than
necessary in the context of this thread (logical decoding uses it,
anyway).
--
Michael
