Quick Links

Re: Rearchitecting for storage

From:	Matthew Pounsett <matt(at)conundrum(dot)com>
To:	pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: Rearchitecting for storage
Date:	2019-07-19 15:06:23
Message-ID:	CAAiTEH_mz4K4gxPGsFcYRd6AdzGRkPmRTBsMXTG49XhHCUuNmQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Thu, 18 Jul 2019 at 09:44, Matthew Pounsett <matt(at)conundrum(dot)com> wrote:

>
> I've recently inherited a database that is dangerously close to outgrowing
> the available storage on its existing hardware. I'm looking for (pointers
> to) advice on scaling the storage in a financially constrained
> not-for-profit.
>

Thanks for your replies, everyone. Here's a quick summary of what I've got
out of this.

Although nobody really addressed the core question of the performance
tradeoffs in different storage architectures, perhaps the fact that nobody
mentioned them means there really aren't any. We'll proceed on the
assumption that externally attached storage really doesn't make a
difference. NAS storage seems like a poor choice to me, for performance
reasons, and nobody's really said anything to disabuse me of that notion.

We're going to have a look at the relative costs of single-head solutions
as well as dual head (server + jbod) setups, and see what gets us the most
growth for the least cost. We'll plan for enough storage to get us five
years of growth, and just accept that maybe in the 5th year we won't be
able to do in-place upgrades without dropping indexes.

Horizontal scalability through range partitioning sounds interesting, but I
don't think it's a cost-effective solution for us right now. As long as
it's possible for the db to fit in a single server (or jbod) using
commodity hardware, the incremental cost of adding more chassis (and
therefore more motherbaords, more CPUs, more memory) isn't offset by a
reduced cost anywhere else (e.g. using cheaper drives). And that's not
even accounting for the increased operational cost of coordinating the DB
across multiple servers. It could be a useful approach if DB growth
outpaces historical averages and we need to add hardware before a normal
replacement cycle. It could also be useful at the end of that replacement
cycle if DB growth has outpaced commodity hardware improvements, and single
server solutions are no longer viable.

The DB server we inherited is currently a single instance, but once we
expand and have replication in place I'll have to do some testing to see if
LVM compression gives us any performance boost (or incurs an acceptable
performance cost). The big question there is whether the processing
required to do the compression is faster than the difference in read times
on the disk... I think that might be dependent on the data and how it's
accessed. It certainly seems like it could give us some benefits, but I
don't think it's an experiment I want to attempt with only a single
production copy of the DB; the downtime required to rebuild the DB server
for A+B comparisons would be unacceptable.

Thanks again everyone. This has been educational.

In response to

Rearchitecting for storage at 2019-07-18 13:44:04 from Matthew Pounsett

Browse pgsql-general by date

	From	Date	Subject
Next Message	Peter J. Holzer	2019-07-19 15:25:12	Re: Rearchitecting for storage
Previous Message	Kenneth Marshall	2019-07-19 14:56:33	Re: Rearchitecting for storage