From: | Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com> |
---|---|
To: | Duarte Carreira <dncarreira(at)gmail(dot)com> |
Cc: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Swathi P <swathi(dot)bluepearl(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Query on postgres_fdw extension |
Date: | 2022-01-20 17:38:59 |
Message-ID: | CAM+6J97sFkWdk1xXwxZA8mvYE-MQRvGjP1YiwMn3zoFPsAH7Ow@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-general |
On Thu, 20 Jan 2022 at 21:29, Duarte Carreira <dncarreira(at)gmail(dot)com> wrote:
> Hello everyone.
>
> I don't know... realistically what do you guys see as a best/simple
> approach?
>
We implemented a custom sharding (directory sharding with lookup tables)
layer of 10 shards, but it was write local, read global.
the api was responsible for all rebalancing incase of hotspots.
other api sharding examples ...
Database Sharding: Solving Performance in a Multi-Tenant Restaurant Data
Analytics System (gotenzo.com)
<https://blog.gotenzo.com/tech/database-sharding-solving-performance-in-a-multi-tenant-restaurant-data-analytics-system>
<https://blog.gotenzo.com/tech/database-sharding-solving-performance-in-a-multi-tenant-restaurant-data-analytics-system>
although
it worked really well, when you are maintaining it on your own, it gets
really painful, much beyond id generation globally.
i will not go into the details, but in short, sharded setup is not the same
as local setup. there would be many more things that would not work as
expected
which would otherwise work really well on a standalone setup.
writes over shard may work, but you realize it is over the network, so you
can lock you table for a much longer duration and cause a much more serious
outage,
if you really wanted to have distributed writes with unique keys, you can
go with uuid i think or have your own seq generator globally (see below).
*Move ID generation out of the database to an ID generation service outside
of the database… As soon as a piece of work enters their system, an ID gets
assigned to it… and that ID generated in a way that is known to be globally
unique within their system*
A Better ID Generator For PostgreSQL | robconery
<https://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/>
Index of /shard_manager/shard_manager-0.0.1/ (pgxn.org)
<https://api.pgxn.org/src/shard_manager/shard_manager-0.0.1/> (pretty old
but if you can use your coordinator server as a id_generator(), then you
can generate ids which are globally unique)
Sharding & IDs at Instagram. With more than 25 photos and 90 likes… | by
Instagram Engineering | Instagram Engineering (instagram-engineering.com)
<https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c>
imho, do not try sharding manually, unless you have enough dbas to maintain
the shards, try using citus, it would make a lot of the manual stuff easier.
also, the below work arounds are bad, incase you just want to rush through
postgres=# \c localdb
You are now connected to database "localdb" as user "postgres".
localdb=#
localdb=# \dt
Did not find any relations.
localdb=# \det
List of foreign tables
Schema | Table | Server
--------+-------+---------------
public | t | remote_server
(1 row)
localdb=# \det+ t
List of foreign tables
Schema | Table | Server | FDW options |
Description
--------+-------+---------------+----------------------------------------+-------------
public | t | remote_server | (schema_name 'public', table_name 't') |
(1 row)
localdb=# \det t
List of foreign tables
Schema | Table | Server
--------+-------+---------------
public | t | remote_server
(1 row)
localdb=# create or replace function getnext() returns int as $_$ select id
FROM dblink ('dbname = remotedb', $$ select nextval('t_id_seq') $$ ) as
newtable(id int); $_$ language sql;
CREATE FUNCTION
localdb=# \c remotedb
You are now connected to database "remotedb" as user "postgres".
remotedb=# \dt t
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t | table | postgres
(1 row)
remotedb=# \ds t_id_seq
List of relations
Schema | Name | Type | Owner
--------+----------+----------+----------
public | t_id_seq | sequence | postgres
(1 row)
remotedb=# \c localdb
You are now connected to database "localdb" as user "postgres".
localdb=# insert into t values (getnext(), 100);
INSERT 0 1
localdb=# insert into t values (getnext(), 100);
INSERT 0 1
localdb=# select * from t;
id | col1
----+------
11 | 4
12 | 5
13 | 100
14 | 100
(4 rows)
just my opinion, ignore it not useful.
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2022-01-20 17:57:32 | Re: BUG #17372: Altering statistics during concurrent drop can lead to a server crash |
Previous Message | Duarte Carreira | 2022-01-20 15:59:07 | Re: Query on postgres_fdw extension |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-01-20 17:52:27 | Re: [EXT] Re: Can we get the CTID value |
Previous Message | Garfield Lewis | 2022-01-20 17:03:26 | Re: [EXT] Re: Can we get the CTID value |