From: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Kane <andrew(at)chartkick(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: A space-efficient, user-friendly way to store categorical data |
Date: | 2018-02-11 23:24:29 |
Message-ID: | CAA8=A7-df9JSaVqHy2bRJBfNP=NjqdfmKHMPbPcM6Cs_3x7RoQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Feb 12, 2018 at 9:10 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Kane <andrew(at)chartkick(dot)com> writes:
>> A better option could be a new "dynamic enum" type, which would have
>> similar storage requirements as an enum, but instead of labels being
>> declared ahead of time, they would be added as data is inserted.
>
> You realize, of course, that it's possible to add labels to an enum type
> today. (Removing them is another story.)
>
> You haven't explained exactly what you have in mind that is going to be
> able to duplicate the advantages of the current enum implementation
> without its disadvantages, so it's hard to evaluate this proposal.
>
This sounds rather like an idea I have been tossing around in my head
for a while, and in sporadic discussions with a few people, for a
dictionary object. The idea is to have an append-only list of labels
which would not obey transactional semantics, and would thus help us
avoid the pitfalls of enums: an addition would never be rolled back.
The use case would be a jsonb representation which would replace
object keys with the oid value of the corresponding dictionary entry,
rather like enums do now. We could have a per-table dictionary, which
in most typical json use cases would be very small, and we know from
some experimental data that the space savings from such a change
would often be substantial.

To be of practical use for the jsonb case, I believe the dictionary
would have to be modifiable dynamically, rather than requiring
explicit additions.
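
To make the idea concrete, here is a rough sketch in Python of the key-dictionary scheme described above. It is only an illustration under stated assumptions, not a proposed implementation: the names `KeyDictionary`, `encode` and `decode` are hypothetical, plain integers stand in for oids, and an in-memory list stands in for whatever catalog structure a real per-table dictionary would use.

```python
class KeyDictionary:
    """Hypothetical append-only dictionary of labels (integers stand in for oids)."""

    def __init__(self):
        self._ids = {}      # label -> id
        self._labels = []   # id -> label; entries are only ever appended

    def intern(self, label):
        """Return the id for label, adding it on first sight (dynamic addition)."""
        ident = self._ids.get(label)
        if ident is None:
            ident = len(self._labels)
            self._labels.append(label)
            self._ids[label] = ident
        return ident

    def label(self, ident):
        """Look up the label for a previously assigned id."""
        return self._labels[ident]

    def __len__(self):
        return len(self._labels)


def encode(value, dictionary):
    """Replace object keys with dictionary ids, recursing into nested values."""
    if isinstance(value, dict):
        return {dictionary.intern(k): encode(v, dictionary) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v, dictionary) for v in value]
    return value


def decode(value, dictionary):
    """Inverse of encode: map ids back to their labels."""
    if isinstance(value, dict):
        return {dictionary.label(k): decode(v, dictionary) for k, v in value.items()}
    if isinstance(value, list):
        return [decode(v, dictionary) for v in value]
    return value


# Usage: keys are added to the dictionary as data arrives.
keys = KeyDictionary()
doc = {"name": "a", "tags": [{"name": "x"}]}
stored = encode(doc, keys)
restored = decode(stored, keys)
```

Because entries are never removed, an id handed out during a write that is later abandoned simply stays in the dictionary unused, which is the non-transactional behaviour mentioned above.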

I hadn't thought about this as a sort of super enum that was usable
directly by users, but it makes sense.

I have no idea how hard or even possible it would be to implement.

cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services