Re: Logcal replication in large scale

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Nicklas Avén <nicklas(dot)aven(at)jordogskog(dot)no>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Logcal replication in large scale
Date: 2017-09-20 09:12:53
Message-ID: CABUevEwKjY2cE0ri5gpQ8vtPCaXvQBc97rZ6uVNEVdpufF=ngQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Sep 20, 2017 at 10:02 AM, Nicklas Avén <nicklas(dot)aven(at)jordogskog(dot)no>
wrote:

> On Wed, 2017-09-20 at 09:44 +0200, Magnus Hagander wrote:
>
> On Wed, Sep 20, 2017 at 8:53 AM, Nicklas Avén <nicklas(dot)aven(at)jordogskog(dot)no>
> wrote:
>
> Hallo all
>
> I am thrilled about logical replication in PostgreSQL 10. My head have
> started spinning about use cases.
>
> Would it be possible to use logical replication as a distribution
> method of data?
>
>
> As an answer to the generic question: yes :)
>
>
>
> I think about map data from national mapping authorities. The problem
> is to get the updates of their data sets. Especially the open data sets
> are now distributed as files (shape files) in Sweden and as pg_dump in
> Norway.
>
> I guess it is beyond what logical replication is designed for, so I ask
> what problems that might arise for a scenario like:
>
> The distributor has a publication database with logical replication
> publications of the tables. All users, probably thousands or more,
> would subscribe to that publication to get an updated copy of the data
> set.
>
> How would the publication server react? I guess the WAL-files will only
> be written once anyway?
>
>
> Yes. But they will need to be kept around until *all* subscribers have
> pulled down their changes. So even one subscriber falling behind will mean
> your WAL will never get cleaned up.
>
> Of course, you can keep some sort of watcher process that kills old
> replication slots.
>
> I am also not sure how well PostgreSQL will react to having thousands of
> replication slots. It's not what the system was designed for I believe :)
>
>
> Ok, I have to read me up on how this works. I thought about it as a bucket
> of WAL-files that the subscribers just "grab".
> But of course there is some bookkeeping to make things work.
>

No, it's streaming replication.

And in the end, that bucket becomes infinitely large.

>
> I guess there is also mechanisms so a new subscriber can get a complete
> table after the publcation have been active for a long time? I mean the
> "cleaning up" of Wal-files will not leave new subscribers missing what is
> written to the table long ago?
>

Yes, new subscriptions will get the current version of the data and only
then start buffering changes.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message bluefrog 2017-09-20 11:28:29 Re: random row from a subset
Previous Message Albe Laurenz 2017-09-20 08:45:37 Re: VM-Ware Backup of VM safe?