Re: Varchar vs foreign key vs enumerator - table and index size

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Łukasz Walkowski <lukasz(dot)walkowski(at)homplex(dot)pl>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Varchar vs foreign key vs enumerator - table and index size
Date: 2013-09-03 19:00:53
Message-ID: CABWW-d16BtaUYwQTmtzyf-1R4QVgxwnOt_Di5rk5_x-Y7+KGkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Well, in older version of Hibernate it was a little tricky to handle
Postgresql Enums. Dunno if it's out of the box now.
Also adding new value is an explicit operation (much like with lookup
table). I've had quite a complex code with second connection opening to
support lookup table filling without flooding original transaction with
additional locks that could lead to deadlocks.
BTW: Does adding new value to enum adds some locks? Can a check if value
exists and adding new value be done in atomic fashion without grabbing some
global lock?
P.S. As I see, it can be a topic for good article for, say, dzone. The
problem can be quite tricky in MVCC database and choice must be done wisely.

Best regards, Vitalii Tymchyshyn

2013/9/2 Andrew Dunstan <andrew(at)dunslane(dot)net>

>
> On 09/02/2013 05:53 AM, Łukasz Walkowski wrote:
>
>> On 1 wrz 2013, at 05:10, Vitalii Tymchyshyn <tivv00(at)gmail(dot)com> wrote:
>>
>>>
>>> Well, there are some more options:
>>> a) Store int keys and do mapping in the application (e.g. with java
>>> enums). This can save you a join, that is especially useful if you are
>>> going to do paged output with limit/offset scenario. Optimizer sometimes
>>> produce suboptimal plans for join in offset/limit queries.
>>> b) Store small varchar values as keys (up to "char" type if you really
>>> want to save space) and do user display mapping in application. It's
>>> different from (a) since it's harder to mess with the mapping and values
>>> are still more or less readable with simple select. But it can be less
>>> efficient than (a).
>>> c) Do mixed approach with mapping table, loaded on start into
>>> application memory. This would be an optimization in case you get into
>>> optimizer troubles.
>>>
>>> Best regards, Vitalii Tymchyshyn
>>>
>> I'd like to leave database in readable form because before I add some new
>> queries and rest endpoints to the application, I test them as ad-hoc
>> queries using command line. So variant a) isn't good for me. Variant b) is
>> worth trying and c) is easy to code, but I still prefer having all this
>> data in database independent of application logic.
>>
>>
>
> I think the possible use of Postgres enums has been too easily written off
> in this thread. Looking at the original problem description they look like
> quite a good fit, despite the OP's skepticism. What exactly is wanted that
> can't be done with database enums? You can add new values to the type very
> simply. You can change the values of existing labels in the type slightly
> less simply, but still without any great difficulty. Things that are hard
> to do include removing labels in the set and changing the sort order,
> because those things would require processing tables where the type is
> used, unlike the simple things. But neither of these is required for
> typical use cases. For most uses of this kind they are very efficient both
> in storage and processing.
>
> cheers
>
> andrew
>

--
Best regards,
Vitalii Tymchyshyn

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2013-09-03 19:02:00 Re: planner parameters
Previous Message Roberto Grandi 2013-09-03 16:16:13 COPY TO and VACUUM