From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Pluggable toaster |
Date: | 2021-12-30 16:40:09 |
Message-ID: | 224711f9-83b7-a307-b17f-4457ab73aa0a@sigaev.ru |
Lists: | pgsql-hackers |
Hi!
We are working on a custom toaster for JSONB [1], because the current
TOAST is universal for all data types, and because of that it has some
disadvantages:
- "one toast fits all" may not be the best solution for a particular
type and/or use case
- it doesn't know the internal structure of the data type, so it cannot
choose an optimal toast strategy
- it can't share common parts between different rows or even between
versions of the same row
Modifying the current toaster to cover all tasks and cases looks too
complex; moreover, it would not work for custom data types. Postgres
is an extensible database, so why not extend its extensibility even
further and have pluggable TOAST? We propose separating the toaster
from the heap using a toaster API similar to the table AM API etc.
The following patches apply on top of the patch in [1]:
1) 1_toaster_interface_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_interface
Introduces syntax for storage and a formal toaster API. Adds the column
atttoaster to pg_attribute. By design, this column should never be the
invalid oid for a toastable datatype, i.e. it must hold a valid oid for
any type (not column) with non-plain storage. Since a toaster may
support only particular datatypes, core should check that the toaster
being set is correct by calling the toaster's validate method. The new
commands can be found in src/test/regress/sql/toaster.sql
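To illustrate the validate check described above, here is a minimal sketch in plain C. It is not code from the patch: the struct ToasterRoutineSketch, the callback signature, and can_assign_toaster() are hypothetical names chosen for illustration. The only values taken from PostgreSQL itself are the built-in type OIDs for bytea (17) and jsonb (3802).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Built-in PostgreSQL type OIDs, used here only as sample inputs. */
#define BYTEA_TYPE_OID 17
#define JSONB_TYPE_OID 3802

/* Hypothetical toaster descriptor: only the validate callback is modelled. */
typedef struct ToasterRoutineSketch
{
    /* Returns true if this toaster can handle values of the given type. */
    bool (*validate)(Oid type_oid);
} ToasterRoutineSketch;

/* Example toaster that accepts bytea columns only. */
static bool
bytea_only_validate(Oid type_oid)
{
    return type_oid == BYTEA_TYPE_OID;
}

static const ToasterRoutineSketch bytea_only_toaster = { bytea_only_validate };

/*
 * Core-side check (assumed shape): refuse to attach a toaster to a column
 * unless the toaster's own validate method accepts the column's type.
 */
bool
can_assign_toaster(const ToasterRoutineSketch *toaster, Oid column_type)
{
    return toaster != NULL
        && toaster->validate != NULL
        && toaster->validate(column_type);
}
```

With this shape, attaching bytea_only_toaster to a jsonb column would be rejected before any value is toasted.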
The on-disk toast pointer now has one more possible struct,
varatt_custom, with a fixed header and a variable tail that is used as
storage by custom toasters. The format of the built-in toaster is kept
to allow simple pg_upgrade logic.
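As a rough picture of "fixed header plus variable tail", the following C sketch uses a flexible array member. The field names and their order are assumptions made for illustration; the actual varatt_custom layout is defined in the patch and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, NOT the struct from the patch. */
typedef struct CustomToastPointer
{
    uint32_t      toaster_oid;  /* which toaster produced this pointer */
    uint32_t      raw_size;     /* size of the original, untoasted value */
    uint32_t      tail_size;    /* number of bytes stored in tail[] */
    unsigned char tail[];       /* toaster-private data, opaque to core */
} CustomToastPointer;

/* Allocate a pointer whose tail is a copy of the given bytes. */
CustomToastPointer *
make_custom_pointer(uint32_t toaster_oid, uint32_t raw_size,
                    const void *tail, uint32_t tail_size)
{
    CustomToastPointer *p =
        malloc(offsetof(CustomToastPointer, tail) + tail_size);

    if (p == NULL)
        return NULL;
    p->toaster_oid = toaster_oid;
    p->raw_size = raw_size;
    p->tail_size = tail_size;
    if (tail_size > 0)
        memcpy(p->tail, tail, tail_size);
    return p;
}

/* Total size of the pointer: fixed header plus variable tail. */
size_t
custom_pointer_size(const CustomToastPointer *p)
{
    return offsetof(CustomToastPointer, tail) + p->tail_size;
}
```

Core code would only need the fixed header to route the value to the right toaster; the tail is interpreted by that toaster alone.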
Since the toaster for a column can be changed during the table's
lifetime, we had two options for the toaster drop operation:
- allow dropping a toaster: once a column's toaster has been changed,
we would need to re-toast all values, which could be extremely
expensive. In any case, functions/operators should be ready to work
with values toasted by different toasters, although every toaster
must support the simple toast/detoast operations, which lets any
existing code work with the new approach. Tracking dependencies
between toasters and rows looks like a bad idea.
- disallow dropping a toaster. We don't believe that there will be many
toasters at the same time (the number of AMs isn't very high either,
and we don't believe that it will change significantly in the near
future), so prohibiting the drop of a toaster looks reasonable.
In this patch set we chose the second option.
The toaster API includes a get_vtable method, which is intended to give
access to custom toaster features that aren't covered by this API. The
idea is that the toaster returns a structure with some values and/or
pointers to the toaster's methods, and the caller can use it for
particular purposes (see patch 4). The kind of structure is identified
by a magic number, which should be the first field of the structure.
contrib/dummy_toaster is also added to simplify checking.
psql/pg_dump are modified to support the toaster object concept.
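The magic-number convention for get_vtable can be sketched in C as follows. Everything here is hypothetical (the magic value, the AppendableVtable struct, demo_get_vtable() and as_appendable()); the point is only that the caller inspects the first field before trusting the rest of the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value identifying one kind of vtable. */
#define APPENDABLE_VTABLE_MAGIC 0x41505044u

/* Hypothetical vtable: the magic number must be the first field. */
typedef struct AppendableVtable
{
    uint32_t magic;
    /* Toy method: returns the length after appending extra_len bytes. */
    int    (*append)(int current_len, int extra_len);
} AppendableVtable;

static int
append_len(int current_len, int extra_len)
{
    return current_len + extra_len;
}

static const AppendableVtable appendable_vtable = {
    APPENDABLE_VTABLE_MAGIC, append_len
};

/* Stand-in for a toaster's get_vtable method: returns an opaque pointer. */
const void *
demo_get_vtable(void)
{
    return &appendable_vtable;
}

/*
 * Caller side: read the first field, and cast only if the magic number
 * matches the kind of vtable the caller knows how to use.
 */
const AppendableVtable *
as_appendable(const void *vtable)
{
    if (vtable == NULL)
        return NULL;
    if (*(const uint32_t *) vtable != APPENDABLE_VTABLE_MAGIC)
        return NULL;
    return (const AppendableVtable *) vtable;
}
```

A caller that gets NULL back from as_appendable() simply falls back to the generic toast/detoast path.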
2) 2_toaster_default_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_default
The built-in toaster is implemented (with some refactoring) using the
toaster API as the generic (or default) toaster. dummy_toaster here is
a minimal working example: it saves the value directly in the toast
pointer and fails if the value is greater than 1kb.
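The behaviour described for dummy_toaster (store the value inside the pointer, reject anything above 1kb) can be sketched in a few lines of C. This is an illustration under assumed names (DummyPointer, dummy_toast, dummy_detoast), not the contrib module's code, and it uses a fixed-size buffer for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DUMMY_MAX_INLINE 1024   /* the 1kb limit mentioned above */

/* Hypothetical pointer: the whole value lives inside it. */
typedef struct DummyPointer
{
    uint32_t      size;
    unsigned char data[DUMMY_MAX_INLINE];
} DummyPointer;

/* Toast: copy the value into the pointer; return -1 if it is too large. */
int
dummy_toast(const void *value, size_t size, DummyPointer *out)
{
    if (size > DUMMY_MAX_INLINE)
        return -1;
    out->size = (uint32_t) size;
    memcpy(out->data, value, size);
    return 0;
}

/*
 * Detoast: copy the value back out. Returns the number of bytes written,
 * or -1 if the caller's buffer is too small.
 */
int
dummy_detoast(const DummyPointer *ptr, void *buf, size_t buf_size)
{
    if (buf_size < ptr->size)
        return -1;
    memcpy(buf, ptr->data, ptr->size);
    return (int) ptr->size;
}
```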
3) 3_toaster_snapshot_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_snapshot
The patch implements a technique to distinguish row versions in toasted
values, so that common parts of toasted values can be shared between
different versions of rows.
4) 4_bytea_appendable_toaster_v1.patch.gz
https://github.com/postgrespro/postgres/tree/bytea_appendable_toaster
The contrib module implements a toaster for non-compressed bytea
columns, which allows fast appending to an existing bytea value. The
appended tail is stored directly in the toast pointer if there is
enough room for it.
Note: the patch modifies byteacat() to support the contrib toaster. It
seems this looks ugly, and the contrib module should create a new
concatenation function.
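The "append into the pointer if there is room" idea from patch 4 could look roughly like the C sketch below. The capacity of 64 bytes, the AppendablePointer struct and the function names are all assumptions for illustration; the patch defines its own layout and limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INLINE_TAIL_CAPACITY 64   /* hypothetical room inside the pointer */

/* Hypothetical pointer: externally stored prefix plus an inline tail. */
typedef struct AppendablePointer
{
    uint32_t      stored_len;   /* bytes already in external storage */
    uint32_t      tail_len;     /* bytes currently held in tail[] */
    unsigned char tail[INLINE_TAIL_CAPACITY];
} AppendablePointer;

/*
 * Try to append data to the inline tail. Returns 1 on success, or 0 if
 * there is not enough room, in which case the pointer is left unchanged
 * and the caller has to fall back to rewriting the external storage.
 */
int
try_append_inline(AppendablePointer *p, const void *data, size_t len)
{
    if (len > INLINE_TAIL_CAPACITY - p->tail_len)
        return 0;
    memcpy(p->tail + p->tail_len, data, len);
    p->tail_len += (uint32_t) len;
    return 1;
}

/* Logical length of the value: stored prefix plus inline tail. */
size_t
appendable_total_len(const AppendablePointer *p)
{
    return (size_t) p->stored_len + p->tail_len;
}
```

The fast path touches only the pointer itself, which is why small appends avoid rewriting the already-toasted part of the value.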
We are open to any questions, discussion, objections and advice.
Thank you.
People behind this work:
Oleg Bartunov
Nikita Gluhov
Nikita Malakhov
Teodor Sigaev
[1]
https://www.postgresql.org/message-id/flat/de83407a-ae3d-a8e1-a788-920eb334f25b(at)sigaev(dot)ru
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
Attachment | Content-Type | Size |
---|---|---|
4_bytea_appendable_toaster_v1.patch.gz | application/gzip | 5.7 KB |
3_toaster_snapshot_v1.patch.gz | application/gzip | 6.3 KB |
2_toaster_default_v1.patch.gz | application/gzip | 26.4 KB |
1_toaster_interface_v1.patch.gz | application/gzip | 41.3 KB |