Re: Rearchitecting for storage

From: Matthew Pounsett <matt(at)conundrum(dot)com>
To: Luca Ferrari <fluca1978(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Rearchitecting for storage
Date: 2019-07-19 14:41:31
Message-ID: CAAiTEH8SbbhdXcpNuu6kZzP7kJZKL_dFUcJ=TxBB_VDopzmKhg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 19 Jul 2019 at 04:21, Luca Ferrari <fluca1978(at)gmail(dot)com> wrote:

>
> This could be trivial, but any chance you can partition the table
> and/or archive unused records (at least temporarly)? A 18 TB table
> quite frankly sounds a good candidate to contain records no one is
> interested in the near future.
>

Partitioning is a possibility. The whole database is historical test
results, stored specifically for doing comparisons over time, so I'm not
sure we can actually archive anything. Expiring old test data is a
discussion we might have to have, eventually.

> In any case, if you can partition the table chances are you can at
> least do a per-table backup that could simplify maintanance of the
> database.
>

My current backup plan for this database is on-site replication, and a
monthly pg_dump from the standby to be copied off-site. Doing per-table
backups sounds like a great way to end up with an inconsistent backup, but
perhaps I misunderstand what you mean.

Another possibility is putting the server into backup mode and taking a
snapshot of the filesystem, but coordinating that across chassis (in the
case where partitioning is used) in such a way that the db is consistent
sounds like a hard problem... unless issuing pg_start_backup on the chassis
holding the master table coordinates backup mode on all the chassis holding
child tables at the same time? I haven't read enough on that yet.

>
> In desperate order, I would check also the log files (I mean, textual
> logs, not wals) because occasionally I found them requiring a few GBs
> on my disk, and that can be easily archived to gain some more extra
> space.
> Then I would go for some commodity NAS to attach as extra storage, at
> least for the upgrade process.
>

Okay. So I guess the short answer is no, nobody really knows how to judge
how much space is required for an upgrade? :)

Any logs we have are going to be a rounding error when compared to the
database itself. And buying storage last-minute because an upgrade failed
is exactly the sort of thing that a resource constrained not-for-profit
can't do. We really need to be able to plan this out long term so that we
get as much as possible out of every dollar.

> If any of the following fails, I would probably drop all the indexes
> to gain extra space, perform the upgrade, and then reindex (removing
> the old cluster, in the case it has not been upgraded with the link
> option).
>

Yeah, this sort of trial-and-error approach to getting upgrades done will
bother me, but it seems like it'll be necessary once we start growing into
whatever new home we get for the db.

Thanks very much for your time on this.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kenneth Marshall 2019-07-19 14:56:33 Re: Rearchitecting for storage
Previous Message Matthew Pounsett 2019-07-19 14:14:12 Re: Rearchitecting for storage