Re: Array columns vs normalized table

From: Peter Hunsberger <peter(dot)hunsberger(at)gmail(dot)com>
To: Lee Hachadoorian <lee(dot)hachadoorian(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Array columns vs normalized table
Date: 2010-03-02 17:57:00
Message-ID: cc159a4a1003020957p41e77497rc08568bf17003912@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Mar 2, 2010 at 11:21 AM, Lee Hachadoorian
<lee(dot)hachadoorian(at)gmail(dot)com> wrote:
> I work with state labor data which is reported to us in the form
>
>        industry, year, quarter1, quarter2, quarter3, quarter4
>
> where each quarter represents an employment count. Obviously, this can
> be normalized to
>
>        industry, year, quarter, employment
>
> Can anyone comment on, or point to me to an article or discussion
> regarding, why one would use an array column instead of normalizing
> the data? That is, would there be any benefit to storing it as
>
>        industry int, year smallint, employment int[ ]
>
> where the last column would be a four element array with data for the
> four quarters.

I think you might want to step back and ask yourself why you'd want to
normalize this data at all? Unless you are trying to operate on all 4
columns at once (comparisons or aggregation, etc.) I see no reason to.
It will take more space and give you no new capabilities. If you
need do need to operate on all 4 quarters simultaneously then you may
very well want to normalize, but in that case there's certainly no
advantage in going to an array type. Personally, the only reason I'd
see for ever using an array type is when you have many very closely
related values that would cause some huge number of rows if the data
is stored normalized

--
Peter Hunsberger

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Scott Bailey 2010-03-02 18:00:13 Re: Array columns vs normalized table
Previous Message Pavel Stehule 2010-03-02 17:54:47 Re: Array columns vs normalized table