Re: PG Sharding

From: Matej <gmatej(at)gmail(dot)com>
To: Thomas Boussekey <thomas(dot)boussekey(at)gmail(dot)com>
Cc: Melvin Davidson <melvin6925(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: PG Sharding
Date: 2018-01-29 15:49:19
Message-ID: CAJB+8mag3tq8O5x-h2B6WBuhbbzMO-QxM-f4K5N1vh=ntq4rFA@mail.gmail.com
Lists: pgsql-general

Hi Thomas.

Thanks.

We also looked at these solutions:
- PGXL: I am a little afraid we would be the test dummies; I have not heard
of many production installs.
- Citus: scalability seems heavily limited because of the master
(coordinator) node design.

Regarding partitioning, we are considering pg_pathman ourselves. We were
hoping for PG 10 native partitioning, but so far it does not bring many
improvements performance-wise.
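
As an illustration of the two options (table and column names below are made
up for the example), pg_pathman partitions an existing table through its SQL
API and also offers hash partitioning, while PG 10 native partitioning is
declarative and limited to range and list:

    -- pg_pathman (assuming the extension is installed and preloaded):
    -- hash-partition an existing table on its sharding key
    CREATE EXTENSION pg_pathman;
    SELECT create_hash_partitions('transactions', 'customer_id', 16);

    -- PG 10 native declarative partitioning (range/list only in 10):
    CREATE TABLE accounts (
        customer_id bigint      NOT NULL,
        created_at  timestamptz NOT NULL
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE accounts_2018_01 PARTITION OF accounts
        FOR VALUES FROM ('2018-01-01') TO ('2018-02-01');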

Overall we are still considering manual application-level sharding, as this
seems to be the most scalable approach with the least added latency. The
built-in solutions seem to introduce extra lag, and I am afraid of what to do
when something goes wrong: then it is not a black box anymore and you have to
study the details.
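
For the application-level routing itself, a minimal sketch of what we mean
(names are purely illustrative): keep a small shard map and derive the shard
number from a hash of the sharding key, for example with PostgreSQL's
built-in hashtext():

    -- hypothetical shard map, kept in a small metadata database
    CREATE TABLE shard_map (
        shard_no int  PRIMARY KEY,
        dsn      text NOT NULL  -- connection string of the node holding this shard
    );

    -- route a customer to one of 16 shards; the application then opens
    -- (or reuses) a connection using the matching dsn
    SELECT shard_no, dsn
    FROM shard_map
    WHERE shard_no = abs(hashtext('customer-42')::bigint) % 16;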

For node loss we plan a master-slave setup, and there will not be that many
cross-shard queries.

BR
Matej

2018-01-29 16:15 GMT+01:00 Thomas Boussekey <thomas(dot)boussekey(at)gmail(dot)com>:

> Hello,
>
> Facing the same situation, I'm considering 3 solutions:
> - Sharding with postgres_xl (waiting for a Pg10 release)
> - Sharding with citusdata (Release 7.2, compatible with Pg10 and
> pg_partman, seems interesting)
> - Partitioning with PG 10 native partitioning or pg_partman
>
> With colleagues, we have tested the 3 scenarios.
> Sharding looks interesting, but you have to understand its behaviour in
> case of node loss or cross-node queries.
>
> Thomas
>
> 2018-01-29 15:44 GMT+01:00 Melvin Davidson <melvin6925(at)gmail(dot)com>:
>
>>
>>
>> On Mon, Jan 29, 2018 at 9:34 AM, Matej <gmatej(at)gmail(dot)com> wrote:
>>
>>> Hi Everyone.
>>>
>>> We are looking at a rather large fintech installation, and as the
>>> scalability requirements are high, we are of course looking at sharding.
>>>
>>> I have looked at many sources on PostgreSQL sharding, but we are a
>>> little confused as to whether to shard by schema, by database, or both.
>>>
>>>
>>> So far our understanding:
>>>
>>> *SCHEMA.*
>>>
>>> PROS:
>>> - seems native to PG
>>> - backup seems easier
>>> - connection pooling seems easier, as you can use the same connection
>>> between shards.
>>>
>>> CONS:
>>> - schema changes seem a little more complicated
>>> - heard of backup and maintenance problems
>>> - also some caching problems.
>>>
>>> *DATABASE:*
>>>
>>> PROS:
>>> - schema changes are a little easier
>>> - backup and administration seems more robust
>>>
>>> CONS:
>>> - heard of vacuum problems
>>> - connection pooling is hard, as 100 shards would mean 100 pools
>>>
>>>
>>> So what is actually the right approach? I would appreciate it if anyone
>>> could shed some light on this.
>>>
>>> *Thanks*
>>>
>>>
>>>
>>
>> You might also want to consider GridSQL. IIRC it was originally
>> developed by EnterpriseDB. I saw a demo of it a few years ago and it was
>> quite impressive, but I've had no interaction with it since, so you will
>> have to judge for yourself.
>>
>> https://sourceforge.net/projects/gridsql/?source=navbar
>>
>> --
>> Melvin Davidson
>> I reserve the right to fantasize. Whether or not you
>> wish to share my fantasy is entirely up to you.
>>
>
>
