From: | Thomas Boussekey <thomas(dot)boussekey(at)gmail(dot)com> |
---|---|
To: | Matej <gmatej(at)gmail(dot)com> |
Cc: | Melvin Davidson <melvin6925(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: PG Sharding |
Date: | 2018-01-29 16:01:03 |
Message-ID: | CALUeYmdr-HqoBp0a6jXyzvF=y4KNmddGGVNjfPebJKi3DkzNZw@mail.gmail.com |
Lists: | pgsql-general |
Hello Matej,
I found some interesting implementations of Postgres-XL at Tencent (WeChat)
and Javelin. You can find video recordings of talks by engineers from
these companies. Moreover, I attended PgConf.EU in Warsaw in October,
where ESA (the European Space Agency) gave a lightning talk on their Gaïa
project, which uses an 8-datanode cluster.
I agree with you about the SPOF that the master represents in the Citus
architecture. Still, implementation is very easy, and it's an extension in
PG 10! But I had to drop many foreign keys from my data model to be able to
load my tables into Citus.
2 years ago, I was looking for a partitioning extension: pg_partman was
mature, whereas pg_pathman was at version 0.4 and many issues in its
GitHub tracker were written in Cyrillic, and I'm French ;-)... So I took a
closer look at pg_partman.
I'm using pg_partman in production now.
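For reference, a minimal pg_partman setup on PG 10 native partitioning looks roughly like this (the table name and the daily interval are hypothetical, syntax as of pg_partman 4.x):

```sql
-- Hypothetical parent table, partitioned natively by range on created_at
CREATE TABLE public.events (
    id         bigint,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Let pg_partman create and maintain daily child partitions
SELECT partman.create_parent('public.events', 'created_at', 'native', 'daily');
```

After that, the `run_maintenance()` background job keeps creating partitions ahead of time.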
2018-01-29 16:49 GMT+01:00 Matej <gmatej(at)gmail(dot)com>:
> Hi Thomas.
>
> Thanks.
>
> Also looked at those solutions:
> - PGXL: I am a little afraid we would be the test dummies. I have not
> heard of many production installs.
> - Citus seems heavily limited scalability-wise, because of the master-node
> design.
>
> Regarding partitioning, we are considering pg_pathman ourselves. We were
> hoping for PG 10 partitioning, but currently it does not bring many
> performance improvements.
>
> Overall we are still considering manual application-level sharding, as
> this seems to be the most scalable approach with the least added latency.
> The built-in solutions seem to introduce extra lag, and I am afraid of
> what to do when something goes wrong: then it's not a black box anymore
> and you have to study the details.
>
> For node loss we plan a master-slave setup, and there will not be many
> cross-shard queries.
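The application-level routing mentioned above can be sketched minimally like this (shard count and DSNs are hypothetical; a stable hash such as md5 keeps the key-to-shard mapping consistent across application processes, unlike Python's built-in `hash()`):

```python
# Sketch of application-level shard routing; the shard layout below is
# a hypothetical example (4 shards, each a separate Postgres DSN).
import hashlib

SHARDS = [
    "host=pg-shard-0 dbname=app",
    "host=pg-shard-1 dbname=app",
    "host=pg-shard-2 dbname=app",
    "host=pg-shard-3 dbname=app",
]

def shard_for(key: str) -> str:
    """Map a sharding key (e.g. a customer id) to one shard's DSN.

    md5 gives a process-independent, stable mapping; the modulo picks
    the shard index.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]
```

The application then opens (or picks from a pool) a connection to the returned DSN, so each query touches exactly one shard unless a cross-shard fan-out is explicitly coded.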
>
> BR
> Matej
>
>
> 2018-01-29 16:15 GMT+01:00 Thomas Boussekey <thomas(dot)boussekey(at)gmail(dot)com>:
>
>> Hello,
>>
>> Facing the same situation, I'm considering 3 solutions:
>> - Sharding with postgres_xl (waiting for a Pg10 release)
>> - Sharding with citusdata (Release 7.2, compatible with Pg10 and
>> pg_partman, seems interesting)
>> - Partitioning with PG 10 native partitioning or pg_partman
>>
>> With colleagues, we have tested the 3 scenarios.
>> Sharding looks interesting, but you have to understand its behaviour in
>> case of node loss or cross-node queries.
>>
>> Thomas
>>
>> 2018-01-29 15:44 GMT+01:00 Melvin Davidson <melvin6925(at)gmail(dot)com>:
>>
>>>
>>>
>>> On Mon, Jan 29, 2018 at 9:34 AM, Matej <gmatej(at)gmail(dot)com> wrote:
>>>
>>>> Hi Everyone.
>>>>
>>>> We are looking at a rather large fin-tech installation. As the
>>>> scalability requirements are high, we are of course looking at sharding.
>>>>
>>>> I have looked at many sources on PostgreSQL sharding, but we are a
>>>> little confused as to whether to shard by schema, by database, or both.
>>>>
>>>>
>>>> So far our understanding:
>>>>
>>>> *SCHEMA.*
>>>>
>>>> PROS:
>>>> - seems native to PG
>>>> - backup seems easier
>>>> - connection pooling seems easier, as you can use the same connection
>>>> between shards.
>>>>
>>>> CONS:
>>>> - schema changes seem a little more complicated
>>>> - heard of backup and maintenance problems
>>>> - also some caching problems.
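The same-connection point for schema-per-shard can be illustrated with a small sketch (the shard schema naming is hypothetical): a pooled connection is repointed to another shard's schema by changing the search path, without reconnecting.

```sql
-- Hypothetical schema-per-shard layout: shard_0001, shard_0002, ...
SET search_path TO shard_0042, public;

-- Unqualified table names now resolve in shard_0042 first
SELECT * FROM accounts WHERE customer_id = 42;
```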
>>>>
>>>> *DATABASE:*
>>>>
>>>> PROS:
>>>> - schema changes a little easier
>>>> - backup and administration seems more robust
>>>>
>>>> CONS:
>>>> - heard of vacuum problems
>>>> - connection pooling is hard, as 100 shards would mean 100 pools
>>>>
>>>>
>>>> So what is actually the right approach? If anyone could shed some
>>>> light on this, it would be appreciated.
>>>>
>>>> *Thanks*
>>>>
>>>>
>>>>
>>>
>>> You might also want to consider GridSQL. IIRC it was originally
>>> developed by EnterpriseDB. I saw a demo of it a few years ago and it was
>>> quite impressive, but I've had no interaction with it since, so you will
>>> have to judge for yourself.
>>>
>>> https://sourceforge.net/projects/gridsql/?source=navbar
>>>
>>> --
>>> Melvin Davidson
>>> I reserve the right to fantasize. Whether or not you
>>> wish to share my fantasy is entirely up to you.
>>>
>>
>>
>