Re: Largest PG database known to man!

From: Chris Travers <chris(dot)travers(at)gmail(dot)com>
To: Mark Jones <mark(dot)jones(at)enterprisedb(dot)com>
Cc: John R Pierce <pierce(at)hogranch(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Largest PG database known to man!
Date: 2013-10-02 03:09:50
Message-ID: CAKt_ZftqZHu98kQ7GwGPGoKGVs7RcbxHt7Gd4PdWPi_iR-D6fA@mail.gmail.com
Lists: pgsql-general

On Tue, Oct 1, 2013 at 3:00 PM, Mark Jones <mark(dot)jones(at)enterprisedb(dot)com> wrote:

> Thanks for your quick response John.
>
> From the limited information, it is mostly relational.
> As for usage patterns, I do not have that yet.
> I was just after a general feel of what is out there size wise.
>

Usage patterns are going to be critical here. There is a huge difference
between a large amount of data being used in an OLTP workflow and in a
DSS/OLAP workflow. Additionally, I am concerned your indexes are going to
be very large. Now, depending on your usage pattern, breaking things down
carefully with tablespaces and partial indexes may be enough. However,
keep in mind that no single table can be larger than 32TB. At any rate, no
matter what solution you use, I don't see a generically tuned database
being what you want (which means you will be tuning for your workflow).
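As a rough sketch of what I mean by tablespaces plus partial indexes (the
table, column, and path names here are made up for illustration):

```sql
-- Hypothetical example; names are illustrative, not from this thread.
-- Put index storage on a dedicated, faster volume.
CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg_idx';

-- Index only the rows the OLTP workload actually touches, so the
-- index stays far smaller than the full table would suggest.
CREATE INDEX idx_orders_open
    ON orders (customer_id)
    TABLESPACE fast_ssd
    WHERE status = 'open';
```

Whether a partial index like this helps depends entirely on whether the
hot subset is a small, predicate-identifiable slice of the data.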

Now, I think your big limits in OLTP are going to be max table size (32TB)
and index size. These can all be managed (and managed carefully) but they
are limits.
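For what it's worth, the 32TB figure falls out of PostgreSQL's 32-bit
block numbers times the default 8kB block size (2^32 * 8kB = 32TB), and
the usual way to stay under it is to split the table, e.g. with the
inheritance-based partitioning PostgreSQL supports today (names below are
illustrative):

```sql
-- 32TB limit: 2^32 blocks * 8kB default block size = 32TB per table.
-- Inheritance-based partitioning keeps each child table well below it.
CREATE TABLE measurements (
    ts      timestamptz NOT NULL,
    reading double precision
);

-- One child per year; the CHECK constraint lets constraint exclusion
-- skip irrelevant children at query time.
CREATE TABLE measurements_2013 (
    CHECK (ts >= DATE '2013-01-01' AND ts < DATE '2014-01-01')
) INHERITS (measurements);
```

You still need triggers or rules to route inserts to the right child, and
at 200-400TB you would be managing a lot of children, which is part of
why I say these limits can be managed but managed carefully.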

For OLAP you have a totally different set of concerns, and since you are
talking about aggregating a lot of data, vanilla PostgreSQL is going to be
a pain to get working as it is. On the other hand OLAP and large db mixed
workloads is where Postgres-XC might really shine. The complexity costs
there will likely be worth it in removing limitations on disk I/O and lack
of intraquery parallelism.

200TB is a lot of data. 400TB is twice that. Either way you are going to
have a really complex set of problems to tackle regardless of what solution
you choose.

I have heard of db sizes in the 30-100TB range on PostgreSQL even before
Postgres-XC. I am not sure beyond that.

Best Wishes,
Chris Travers

>
> Regards
>
>
> ----------------------------------------------------
> Mark Jones
> Principal Sales Engineer EMEA
>
>
> http://www.enterprisedb.com/
>
> Email: Mark(dot)Jones(at)enterprisedb(dot)com
> Tel: 44 7711217186
> Skype: Mxjones121
>
> On 01/10/2013 22:56, "John R Pierce" <pierce(at)hogranch(dot)com> wrote:
>
> >On 10/1/2013 2:49 PM, Mark Jones wrote:
> >> We are currently working with a customer who is looking at a database
> >> of between 200-400 TB! They are after any confirmation of PG working
> >> at this size or anywhere near it.
> >
> >
> >is that really 200-400TB of relational data, or is it 199-399TB of bulk
> >data (blobs or whatever) interspersed with some relational metadata?
> >
> >what all is the usage pattern of this data? that determines the
> >feasibility of something far more than just the raw size.
> >
> >--
> >john r pierce 37N 122W
> >somewhere on the middle of the left coast
> >
> >
> >
> >--
> >Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> >To make changes to your subscription:
> >http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
>

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more.shtml
