| From: | Łukasz Walkowski <lukasz(dot)walkowski(at)homplex(dot)pl> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-performance(at)postgresql(dot)org |
| Subject: | Re: Varchar vs foreign key vs enumerator - table and index size |
| Date: | 2013-08-31 17:06:01 |
| Message-ID: | A698EB68-A22F-427A-9A7D-0C251B03A7EC@homplex.pl |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-performance |
Tom,
> If you're starting to be concerned about space, it's definitely time to
> get away from this choice. Depending on what locale you're using,
> comparing varchar values can be quite an expensive operation, too.
I don't like wasting space and processing power even if more work is required to achieve this. We use pl_PL.UTF-8 as our locale.
> I think the main "pro" of this approach is that it doesn't use any
> nonstandard SQL features, so you preserve your options to move to some
> other database in the future. The main "con" is that you'd be buying into
> fairly significant rewriting of your application code, since just about
> every query involving these columns would have to become a join.
Well, I don't really think I will move from Postgresql anytime soon. It's just the best database for me. Rewriting code is one of the things I'm doing right now but before I touch database, I want to be sure that the choices I made are good.
> FWIW, I'd be inclined to just use integer not smallint. The space savings
> from smallint is frequently illusory because of alignment considerations
> --- for instance, an index on a single smallint column will *not* be any
> smaller than one on a single int column. And smallint has some minor
> usage annoyances because it's a second-class citizen in the type promotion
> hierarchy --- you may find yourself needing explicit casts to smallint
> here and there.
Ok, thats important information. Thank you.
>
> Space-wise this is going to be equivalent to the integer-foreign-key
> solution. It's much nicer from a notational standpoint, though, because
> you don't need joins --- it's likely that you'd need few if any
> application code changes to go this route. (But I'd advise doing some
> testing to verify that before you take it as a given.)
>
> You're right though that enums are not a good option if you expect
> frequent changes in the pool of allowed values. I guess the question
> is how often does that happen, in your application? Adding a new value
> from time to time isn't much of a problem unless you want to get picky
> about how it sorts relative to existing values. But you can't ever delete
> an individual enum value, and we don't support renaming them either.
> (Though if you're desperate, I believe a manual UPDATE on the pg_enum
> catalog would work for that.)
>
> Another thing to think about is whether you have auxiliary data about each
> value that might usefully be stored as additional columns in the small
> tables. The enum approach doesn't directly handle that, though I suppose
> you could still create small separate tables that use an enum column as
> primary key.
>
> regards, tom lane
So, I'll go for enumerators for device type, eventtype and eventsource as those columns are quite stable. For browser and operating system I'll do external tables.
Thank you - any additional tips are welcome.
Reagards,
Lukasz Walkowski
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Craig James | 2013-09-01 01:31:06 | Re: Varchar vs foreign key vs enumerator - table and index size |
| Previous Message | Tom Lane | 2013-08-31 15:20:47 | Re: Varchar vs foreign key vs enumerator - table and index size |