Hardware advice

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Hardware advice
Date: 2018-01-22 20:46:56
Message-ID: B981F83B-25D0-4936-9A3B-73EB39AF49F0@gmail.com
Lists: pgsql-general

Hi all,

At work we are in the process of setting up a data warehouse using PG 10. I'm looking for a suitable server, but I hardly know anything about server-grade hardware.

Hence, we're looking for some advice, preferably with an opportunity to discuss our situation and anything we may not have taken into account. A face to talk to would be appreciated.
Who provides that at or near the eastern border of the Netherlands?

More details:

We're planning to deploy on bare-metal hardware, with a fallback server of similar or lesser specs for emergencies and upgrades, and perhaps for some (read-only) load balancing of different kinds of loads.

The server will be accessed for reporting and ETL (or ELT) mostly. Both reporting servers (test/devel and production) are configured for at most 40 agents, so that's max 40 connections each to the warehouse for now. So far, we haven't reached that number in real loads, but reports are requested ~40,000 times a month (we measure HTTP requests minus static content).
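To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes a 30-day month and a perfectly even spread of requests, which real reporting traffic will not have, so the average says nothing about peaks. The function names are made up for illustration.

```python
# Rough load figures from the numbers above.
# Assumptions: a 30-day month and evenly spread requests (peaks will be higher).

HOURS_PER_MONTH = 30 * 24  # assumed 30-day month


def max_connections(agents_per_server: int, reporting_servers: int) -> int:
    """Upper bound on simultaneous warehouse connections from the reporting servers."""
    return agents_per_server * reporting_servers


def avg_requests_per_hour(requests_per_month: int) -> float:
    """Average report requests per hour, ignoring daily and weekly peaks."""
    return requests_per_month / HOURS_PER_MONTH


if __name__ == "__main__":
    print("max connections:", max_connections(40, 2))
    print("avg requests/hour: %.1f" % avg_requests_per_hour(40_000))
```

So the two reporting servers together could open at most 80 connections, while the average request rate is below one per minute; the interesting number for sizing is the peak, which these figures do not reveal.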

We will also be doing ETL of (very) remote (SAP) tables to the warehouse server; in what we have so far in our limited test environment, there are tables of over 30GB, most of which is from the last 4 to 5 years.

The biggy though is that we also plan to store factory process measurements on this server (temperatures, pressures, etc. at 5s intervals).
Part of one factory has already been writing that data to a different server, but that's already 4.3 billion records (140GB) for about a year of measuring, and that's not even half of the factory. We will be required to retain 10-15 years of data from several factories (in the short term, at least 2). The expectation is that this will grow to ~15TB for our factory alone.
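The arithmetic behind those figures can be sketched as follows. This is only an estimate under stated assumptions: "about a year" is taken as exactly one year, 1GB is taken as 10^9 bytes, growth is assumed linear, and it is not known whether the 140GB includes indexes. All names are hypothetical.

```python
# Back-of-the-envelope sizing for the measurement data.
# Assumptions: exactly one year measured, decimal units, linear growth.

GB = 10**9
TB = 10**12

RECORDS_PER_YEAR = 4.3e9       # records written so far, in about a year
BYTES_PER_YEAR = 140 * GB      # size of those records
SAMPLE_INTERVAL_S = 5          # one measurement every 5 seconds


def bytes_per_record() -> float:
    """Average on-disk size of one measurement record."""
    return BYTES_PER_YEAR / RECORDS_PER_YEAR


def samples_per_series_per_year() -> int:
    """Measurements one sensor produces per year at the given interval."""
    return 365 * 24 * 3600 // SAMPLE_INTERVAL_S


def projected_size_tb(years: float, coverage: float) -> float:
    """Projected size if the current data covers `coverage` (0..1) of the factory."""
    return BYTES_PER_YEAR / coverage * years / TB


def implied_coverage(target_tb: float, years: float) -> float:
    """Fraction of the factory currently measured that would yield `target_tb`."""
    return BYTES_PER_YEAR * years / (target_tb * TB)


if __name__ == "__main__":
    print("bytes/record: %.1f" % bytes_per_record())
    print("implied series: %.0f" % (RECORDS_PER_YEAR / samples_per_series_per_year()))
    print("15 years at 50%% coverage: %.1f TB" % projected_size_tb(15, 0.5))
    print("coverage implied by 15 TB in 15 years: %.2f" % implied_coverage(15, 15))
```

Under these assumptions a record averages a little under 33 bytes, and 15 years at exactly half coverage would come to about 4.2TB. Since the current data covers less than half the factory, that is a lower bound; a 15TB total over 15 years would correspond to the current data covering roughly 14% of the factory.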

We also want to keep our options for growth of this data warehouse open. There are some lab databases, for example, that currently exist as two separate brand database servers (with different major versions of the lab software, so there are design differences as well), that aren't exactly small either.

I have been drooling over those shiny new AMD Epyc processors, which certainly look adequate with a truckload of memory, a good RAID-10 array and some SSD(s) for the WAL, but it's really hard to figure out how many cores and how much memory we need. Sure, two 7601s (64 cores in total) and 4TB of memory (if that's even for sale) would probably do the trick, but even I think that might be going a little overboard ;)

Oh yeah, apparently we're married to HP or something… At least, IT management told me to look at their offerings.

Regards,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
