Re: adding more space to the existing server

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Julie Nishimura <juliezain(at)hotmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: adding more space to the existing server
Date: 2019-08-05 12:01:24
Message-ID: CAF-3MvPS2r2ZQdSFOTMW3LzfMZYKGB8J0XsuvW7Agj60xQ5PpA@mail.gmail.com
Lists: pgsql-general

On 5 Aug 2019, at 0:39, Julie Nishimura <juliezain(at)hotmail(dot)com> wrote:

> Alban, thank you for your reply. Your suggestion makes sense, and I will be
> talking to our engineers about it. Currently we need to understand:
>
> a) How do we break A -> B replication such that both can become
> independent primaries

That is pretty much like normal failover from A to B, except that you don’t
reverse replication.

You will need to stop your clients from sending data for a bit (if it’s
continuous data, having a buffer in between is a big help - at our company
we’re looking into Apache Kafka for that), so that you can switch half of
them to connect to B instead of A.

Next, you promote B to master. I used the docs for that last time, and they
were pretty clear on the subject.
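
On 9.6, promoting basically comes down to running pg_ctl promote on B (or
touching the trigger_file named in its recovery.conf). A minimal sketch, with
the data directory path just a placeholder:

  # on host B, as the postgres OS user
  pg_ctl promote -D /var/lib/postgresql/9.6/main
  # B should now answer as a primary:
  psql -c "SELECT pg_is_in_recovery();"   # expect 'f'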

> b) How do we reassign C from B->C replication to A->C replication

I don’t think you need to. If you indeed already have A->B->C, after
promoting B to master, you end up with B->C, which is alright.
You just need to add A->D for the other set.

> c) Thoughts on why this isn’t a good plan

That depends on your clients and how you decide which database in the
current cluster they connect to. If you connect specific clients to
specific databases, then all you need to do is to configure half your
clients to connect to B instead.

Another option is to put a virtual database layer in front, such that both
clusters still look like a single database to the outside world. We have
some experience with Dremio for similar purposes (although for read-only
reporting). Mind that the community edition doesn’t do authorization.

> Current:
> A replicates to B
> all requests go to A
>
> Soon:
> A replicates to B -> cascading to C and D
>
> Transition:
> break A replication to B such that both can become primary

Correct.

> stop B replication to C then setup A to replicate to C

I would change that to:
setup A to replicate to D

> End state:
> A replicates to C
> B replicates to D

End state:
A replicates to D
B replicates to C
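
For what it’s worth: on 9.6, which primary a standby follows is simply what
its recovery.conf points at, so getting D to replicate from A is mostly a
matter of primary_conninfo. A rough sketch, with hostnames, paths and the
slot name made up:

  # on host D (seed it with a base backup of A first)
  cat > /var/lib/postgresql/9.6/main/recovery.conf <<'EOF'
  standby_mode = 'on'
  primary_conninfo = 'host=a.example.com port=5432 user=replicator'
  primary_slot_name = 'standby_d'
  recovery_target_timeline = 'latest'
  EOF
  pg_ctl start -D /var/lib/postgresql/9.6/main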

> we remove some of the dbs from A and B, then reassign the traffic based on
> db selections
>
> I hope it all makes sense...
>
> Thank you

It does to me. Now would be a good time for people to chime in if they
don't agree ;)

From: Alban Hertroys <haramrae(at)gmail(dot)com>
Sent: Saturday, August 3, 2019 3:15 AM
To: Julie Nishimura <juliezain(at)hotmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>;
pgsql-general(at)lists(dot)postgresql(dot)org <pgsql-general(at)lists(dot)postgresql(dot)org>;
pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: adding more space to the existing server

> On 2 Aug 2019, at 21:45, Julie Nishimura <juliezain(at)hotmail(dot)com> wrote:

> 1) We use streaming replication, and due to hardware limitations, we
> cannot add more drives to the existing host. That is why we thought we
> could break the existing streaming replication (from a->b) and, instead
> of the currently identical standby (b), introduce a twice-larger host,
> then start replication to that newly larger host and, once it has caught
> up, break it again. Then we make modifications to the "a" host, making it
> larger, and replicate b->a. After it has caught up, break the replication
> again and switch master->standby (if necessary).

Let’s be clear, I’m in no way an expert on replication. In fact, I’ve only
done (streaming) replication once and I managed to mess that up in a minor
way (disabled the wrong service during failover, so data still went to
the database I was attempting to replace for, like, 15 minutes).

> 2) I am not sure about the time, but it is understood that it requires 2
> full replication cycles, and it might be up to 2 weeks with no standby
> situation

No standby situation? Murphy is probably just waiting for that to strike…
I recall a fairly recent story on the FreeBSD ML about someone on Malta who
was migrating a couple dozen terabytes off her main server (because of some
failing disks in her RAID set), using her backup server to move the data
around (with backups removed to make room). Then, due to an accident outside
the building, an aerial 10 kV power line hit another power line in the
ground, causing a fire in one UPS and frying the other one. Losing power at
that point meant that the file systems (ZFS) on both servers ended up in an
unrecoverable state, with no backups left. It didn’t help that the UPSes
were at the bottom of the rack, with the heat and smoke going up into the
servers. What are the chances, right? (And then it turned out that it is
really hard to recover data from a ZFS file system in such a state, which is
what her actual inquiry was about.)

I would definitely prefer to add a 3rd machine into the mix, even if it
were just a temporary machine - a rental perhaps?

From there, I’m certain Adrian knows more about replication than I do. I’d
go with the approach he suggested.

> 4) by pg_basebackup and restore
>
> As of now, we are thinking about other possible solutions, such as
> splitting the existing 37 databases on the cluster into 2 hosts with
> their own standbys. This solution requires breaking up the existing
> replication as well. Can you please point me to some document which lists
> all the steps for breaking up the existing replication properly? We are
> using 9.6 postgres.

I’m going to assume that you will have data coming in while this split is
taking place and that you therefore cannot offline the entire set of
databases for as long as this takes. If not, that would probably allow for
a simpler (faster) scenario.

I think the easiest for this scenario would be to add two more machines (c
and d) and replicate them off the current setup. You want that to happen as
much in parallel as possible, so perhaps replicate c off a and d off b.
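
A rough sketch of seeding such a replica with pg_basebackup (hostnames, paths
and the replication role are placeholders):

  # on the new host c, into an empty data directory
  pg_basebackup -h a.example.com -p 5432 -U replicator \
      -D /var/lib/postgresql/9.6/main -X stream -P -R
  # -R writes a minimal recovery.conf pointing back at a; starting the
  # server then brings it up as a streaming standby
  pg_ctl start -D /var/lib/postgresql/9.6/main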

If you aren’t already using “replication slots”, I found that they make
things both easier to understand and more reliable. You can query their
status, for one thing.
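
For example (the slot name is made up):

  # on the primary, create a physical slot for the new standby to use
  psql -c "SELECT pg_create_physical_replication_slot('standby_c');"
  # later, check the slots and how far the standbys have caught up
  psql -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"
  psql -c "SELECT client_addr, state, replay_location FROM pg_stat_replication;"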

Those replicas will take extra time of course (about double) because you’re
replicating twice what you need, but I don’t think you can replicate parts
of a cluster with your setup unless you go for a different replication
approach (I think per database replication requires statement level
replication?).

After that, decouple both sets into:
a —> b (your current machine)
c —> d (the new ones)

(Although any order should be fine, really, as long as they have caught up.)

At that point I would probably (temporarily) pause replication in at least
one set and create a backup of that.
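
On 9.6 that pause boils down to pausing WAL replay on the standby of that
set, something like this (on d, say):

  # freeze the standby at a consistent point
  psql -c "SELECT pg_xlog_replay_pause();"
  psql -c "SELECT pg_is_xlog_replay_paused();"   # expect 't'
  # ... take the backup ...
  psql -c "SELECT pg_xlog_replay_resume();"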

This is the point to start removing superfluous databases from a and c (so
that a+c make up the complete set again).
After verifying that no databases are missing, unpause replication.
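
The removal itself is roughly this, with the database name purely an example
(DROP DATABASE has to run on the primary of each pair, with nothing still
connected to that database):

  # see what lives where, on both a and c
  psql -c "SELECT datname FROM pg_database WHERE NOT datistemplate ORDER BY 1;"
  # drop a database that is now served by the other pair
  psql -c "DROP DATABASE some_app_db;"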

If instead you find that you accidentally removed a database from both a
and c, you still have the replicas to recover it from. And the backups, of
course, though those will not contain the data that came in after
replication was paused.

I do hope the remaining 3% disk space is enough to cover all that, though...

Regards,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Alban Hertroys
--
There is always an exception to always.
