Re: Two proposed modifications to the PostgreSQL FDW

From: Chris Travers <chris(dot)travers(at)adjust(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Two proposed modifications to the PostgreSQL FDW
Date: 2018-08-21 08:49:29
Message-ID: CAN-RpxBYOaQS_5jd3V8xM3Qg8bDZXXkKXKz5E7AssJShhw6S2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 20, 2018 at 4:44 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Chris Travers <chris(dot)travers(at)adjust(dot)com> writes:
> > I am looking at trying to make two modifications to the PostgreSQL FDW
> and
> > would like feedback on this before I do.
>
> > 1. INSERTMETHOD=[insert|copy] option on foreign table.
>
> > One significant limitation of the PostgreSQL FDW is that it does a
> prepared
> > statement insert on each row written which imposes a per-row latency.
> This
> > hits environments where there is significant latency or few latency
> > guarantees particularly hard, for example, writing to a foreign table
> that
> > might be physically located on another continent. The idea is that
> > INSERTMETHOD would default to insert and therefore have no changes but
> > where needed people could specify COPY which would stream the data out.
> > Updates would still be unaffected.
>
> It seems unlikely to me that an FDW option would be at all convenient
> for this. What about selecting it dynamically based on the planner's
> estimate of the number of rows to be inserted?
>
> A different thing we could think about is enabling COPY TO/FROM a
> foreign table.
>

Actually as I start to understand some aspects Andres's concern above,
there are issues beyond numbers of rows. But yes, selecting dynamically
would be preferable.

Two major things I think we cannot support on this are RETURNING clauses
and ON CONFLICT clauses. So anywhere we need to worry about those a copy
node could not be used. So it is more complex than merely row estimates.

>
> > 2. TWOPHASECOMMIT=[off|on] option
>
> > The second major issue that I see with PostgreSQL's foreign database
> > wrappers is the fact that there is no two phase commit which means that a
> > single transaction writing to a group of tables has no expectation that
> all
> > backends will commit or rollback together. With this patch an option
> would
> > be applied to foreign tables such that they could be set to use two phase
> > commit When this is done, the first write to each backend would
> register a
> > connection with a global transaction handler and a pre-commit and commit
> > hooks would be set up to properly process these.
>
> ENOINFRASTRUCTURE ... and the FDW pieces of that hardly seem like the
> place to start.
>

I disagree about the lack of infrastructure. We have every piece of
infrastructure we need for a minimum viable offering.
1. Two Phase Commit in PostgreSQL
2. Custom Background Workers
3. Pre/Post Commit/Rollback hooks for callbacks.

Those are sufficient to handle the vast majority of error cases.

The one thing we *might* want that we don't have is a startup process to
scan a directory of background worker status files and fire off appropriate
background workers on database start. That hardly seems difficult though.

>
> regards, tom lane
>

--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-08-21 08:59:48 Re: Pluggable Storage - Andres's take
Previous Message Chris Travers 2018-08-21 08:35:38 Re: Two proposed modifications to the PostgreSQL FDW