From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Global Sequences |
Date: | 2012-10-15 21:33:49 |
Message-ID: | CA+U5nMLSh4fttA4BhAknpCE-iAWgK+BG-_wuJS=EAcx7hTYn-Q@mail.gmail.com |
Lists: | pgsql-hackers |
Sequences, as defined by the SQL Standard, provide a series of unique
values. The current implementation in PostgreSQL confines the
generation mechanism to a single node, as is common in many
RDBMSs.
For sharded or replicated systems this forces people to resort to
various hackish mechanisms in user space to emulate a global or
cluster-wide sequence.
The fix for this is an in-core mechanism that allows
coordination between nodes to guarantee unique values.
There are a few options:
1) Manual separation of the value space, so that N1 has 50% of
possible values and N2 has 50%. That has problems when we reconfigure
the cluster, and requires complex manual reallocation of values. So it
starts good but ends badly.
2) Automatic separation of the value space. This could mimic the
manual operation, so it does everything for you, but that's just
making a bad idea automatic.
3) Lazy allocation from the value space. When a node is close to
running out of values, it requests a new allocation and coordinates
with all nodes to confirm the new allocation is good.
(3) is similar to the way values are allocated currently, so the only
addition is a multi-node algorithm to allocate new value
ranges. That seems to be the best way to go. Any implementation of
it has to presume how node configuration and inter-node transport
work, and we would like to keep those open for use by various external
tools.
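
To make (3) concrete, here is a minimal single-process simulation in Python. It is only a sketch, not PostgreSQL code: RangeCoordinator, Node and chunk_size are hypothetical names, and the coordinator is a stand-in for whatever multi-node agreement step is eventually used. For simplicity a node asks for a new range only once its current one is exhausted, rather than when it is merely close to running out.

```python
# Illustrative simulation only; not PostgreSQL code.
# RangeCoordinator, Node and chunk_size are hypothetical names.


class RangeCoordinator:
    """Stand-in for the multi-node agreement step: hands out disjoint ranges."""

    def __init__(self, start=1):
        self._next_free = start

    def allocate(self, size):
        # In a real cluster, this is where all nodes would confirm the range.
        lo = self._next_free
        self._next_free += size
        return lo, lo + size - 1  # inclusive bounds


class Node:
    """A node that serves values locally and refills its range lazily."""

    def __init__(self, name, coordinator, chunk_size=10):
        self.name = name
        self.coordinator = coordinator
        self.chunk_size = chunk_size
        self._next = 0
        self._hi = -1  # empty range, so the first nextval() triggers allocation

    def nextval(self):
        if self._next > self._hi:
            # Local range used up: request a fresh one from the coordinator.
            self._next, self._hi = self.coordinator.allocate(self.chunk_size)
        value = self._next
        self._next += 1
        return value


coord = RangeCoordinator()
n1 = Node("N1", coord, chunk_size=3)
n2 = Node("N2", coord, chunk_size=3)

values_n1 = [n1.nextval() for _ in range(4)]  # needs two allocations
values_n2 = [n2.nextval() for _ in range(2)]  # needs one allocation
```

In this sketch the values are unique across nodes, but they are not globally ordered and unused parts of a range show up as gaps.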
So the proposal is to allow nextval() allocation to call a plugin,
rather than simply write a WAL record and increment. If the plugin is
loaded, all sequences call it (OID allocation does not).
We'd call this the Global Sequence API. The API looks like it would be
pretty stable to me. We can put something in contrib if required to
prove it works, as well as providing some optional caching to further
reduce any noticeable performance impact.
Note that if you did just want to implement manual separation of
ranges then this would also make it slightly easier, so this approach
supports all flavors, which a more hardcoded solution would not.
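
The shape of the idea can be sketched in a few lines of Python. This is not the proposed API, and every name in it (register_plugin, Sequence, StaticRangePlugin) is hypothetical. It only shows nextval() deferring to a plugin when one is loaded, with manual separation of ranges written as one such plugin.

```python
# Illustrative sketch only; not the proposed Global Sequence API.
# register_plugin, Sequence and StaticRangePlugin are hypothetical names.

_plugin = None  # None means no plugin is loaded


def register_plugin(plugin):
    """Install a plugin; from then on every sequence defers to it."""
    global _plugin
    _plugin = plugin


class Sequence:
    def __init__(self, name):
        self.name = name
        self._last = 0

    def nextval(self):
        if _plugin is not None:
            # Plugin loaded: allocation is delegated to it.
            return _plugin.nextval(self.name)
        # Default path: plain local increment (the real code also writes WAL).
        self._last += 1
        return self._last


class StaticRangePlugin:
    """Manual separation: this node owns [lo, hi] for every sequence."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self._last = {}  # last value handed out, per sequence name

    def nextval(self, seq_name):
        value = self._last.get(seq_name, self.lo - 1) + 1
        if value > self.hi:
            raise OverflowError(f"range exhausted for sequence {seq_name}")
        self._last[seq_name] = value
        return value


seq = Sequence("orders_id_seq")
local_values = [seq.nextval(), seq.nextval()]  # no plugin loaded yet

register_plugin(StaticRangePlugin(lo=1000, hi=1001))
plugin_values = [seq.nextval(), seq.nextval()]  # served by the plugin
```

A lazy-allocation plugin would expose the same nextval(seq_name) entry point and differ only in how it obtains its range.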
Any comments before I demonstrate a patch to do this?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services