From: | Melvin Davidson <melvin6925(at)gmail(dot)com> |
---|---|
To: | Olivier Macchioni <olivier(dot)macchioni(at)wingo(dot)ch> |
Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Dynamic Enum? |
Date: | 2017-12-21 14:17:15 |
Message-ID: | CANu8FiwJZ9neCb8DV8czR-8p-y4Rj38EW0aRW71GJ9bJ2czf6g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Dec 21, 2017 at 5:34 AM, Olivier Macchioni <
olivier(dot)macchioni(at)wingo(dot)ch> wrote:
> Hello,
>
> First post on the list, please be indulgent :)
>
> I'm working on a fairly large DB (for me) - 23 GB data, 42 GB indexes, 100
> M rows for my biggest table - and trying to find possible optimisations on
> the storage requirements... and hopefully trying to keep more indexes in
> RAM.
>
> This DB is actually the import of "events" on a sliding window of 30 days,
> and I have no control over the events which are generated. I have control
> of the import script though.
>
> Many of the columns have the following characteristics:
> - VARCHAR
> - low cardinality (typically < 100 distinct values)
> - but I can see new values appearing "at any time" when importing data
> from external systems. I don't expect the cardinality to grow significantly
> though.
>
> The naive storage of those columns is quite demanding when compared to the
> amount of information they carry, and I'm looking at solutions to optimise
> this. Obviously I could:
>
> - use ENUMs to reduce the storage space to 4 bytes on disk (cf
> https://www.postgresql.org/docs/current/static/datatype-enum.html)
> assuming I managed the ENUMs by adding new values when needed. This would
> probably shrink the indexes significantly as well. It may have an impact on
> the comparison of values as well.
>
> - normalize the DB by adding another table and a FOREIGN KEY - the
> management of this table could be done via triggers for instance, with a
> cost in complexity (triggers, applications accessing the DB, ...)
>
> I would find much more elegant to use a datatype where my VARCHARs would
> be be internally stored as a SMALLINT (or similar), indexed as SMALLINT,
> while still being able to be externally seen as if it was a VARCHAR
> (comparison, ORM bindings, ...)
>
> I didn't find any datatype which would work like this :(
>
> Does anyone know of such a solution ?
>
> Thank you,
>
> Olivier
>
>- use ENUMs to reduce the storage space
First, please include the PostgreSQL version and O/S when communicating
with this list
Next, I recommend you avoid ENUMs and instead use Foreign Keys. ENUMs are
old tech from before FK's were available
and are a PIA to manage/maintain. FK's are a lot easier and simpler.
--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
From | Date | Subject | |
---|---|---|---|
Next Message | Olivier Macchioni | 2017-12-21 14:21:23 | Re: Dynamic Enum? |
Previous Message | Magnus Hagander | 2017-12-21 13:41:32 | Re: Migration to PGLister - After |