From: Stephen Frost <sfrost(at)snowman(dot)net>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Largest PG database known to man!
Date: 2013-10-02 01:53:23
Message-ID: 20131002015323.GI2706@tamriel.snowman.net
Lists: pgsql-general
* John R Pierce (pierce(at)hogranch(dot)com) wrote:
> if we assume the tables average 1KB/record (which is a fairly large
> record size even including indexing), you're looking at 400 billion
> records. if you can populate these at 5000 records/second, it
> would take 2.5 years of 24/7 operation to populate that.
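The quoted arithmetic can be sanity-checked directly (this assumes the roughly 400 TB total size the numbers imply, and decimal units; both are assumptions, not stated in the thread):

```sql
-- 400 TB at 1 KB/record -> ~400 billion records;
-- at 5000 records/second that is ~2.5 years of continuous loading.
SELECT 400e12 / 1e3                        AS records,  -- 4.0e11 rows
       400e12 / 1e3 / 5000 / 86400 / 365.0 AS years;    -- ~2.5
```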
5000 1KB records per second is only 5MB/s or so, which is really quite
slow. I can't imagine that they'd load all of this data by doing a
commit for each record; you could load a *huge* amount of data *very*
quickly, in parallel, by using either unlogged tables or wal_level =
minimal with the tables created in the same transaction that loads
them.
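For concreteness, a minimal sketch of that kind of load (table and file names here are invented). With wal_level = minimal and no WAL archiving, a COPY into a table created in the same transaction skips WAL for the loaded data; an unlogged table gives a similar speedup at the cost of crash safety:

```sql
-- One parallel loader's transaction: create + COPY in the same
-- transaction so the load bypasses WAL (requires wal_level = minimal).
BEGIN;
CREATE TABLE events_chunk_001 (LIKE events INCLUDING ALL);
COPY events_chunk_001 FROM '/data/chunks/events_001.csv' WITH (FORMAT csv);
COMMIT;

-- Alternatively, an unlogged table: fast, but truncated after a crash.
CREATE UNLOGGED TABLE events_staging (LIKE events);
COPY events_staging FROM '/data/chunks/events_002.csv' WITH (FORMAT csv);
```

Running many such transactions in parallel, one per chunk, is what makes the aggregate load rate far exceed 5MB/s.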
> this sort of big data system is probably more suitable for something
> like hadoop+mongo or whatever on a cloud of 1000 nodes, not a
> monolithic SQL relational database.
Or a federated PG database using FDWs.
Sadly, I've not personally worked with a data system in the 100+TB range
w/ PG (we do have a Hadoop environment at about that scale), but I've
built systems as large as 25TB which, built correctly, work very well.
Still, I don't think I'd recommend building a single-image PG database
at that scale; I'd shard it instead.
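The federated approach can be sketched with postgres_fdw (new in 9.3, contemporary with this thread); the server name, credentials, and table layout below are invented for illustration:

```sql
-- On the coordinator: attach one shard and expose its table locally.
CREATE EXTENSION postgres_fdw;

CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard1.example.com', dbname 'events');

CREATE USER MAPPING FOR CURRENT_USER SERVER shard1
    OPTIONS (user 'loader', password 'secret');

-- Each shard holds a slice of the data; the coordinator queries it
-- as if it were a local table.
CREATE FOREIGN TABLE events_shard1 (
    id      bigint,
    ts      timestamptz,
    payload text
) SERVER shard1 OPTIONS (table_name 'events');
```

Repeat per shard, and route writes by a shard key in the application or via partitioning on the coordinator.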
Thanks,
Stephen