From: | Peter Hunsberger <peter(dot)hunsberger(at)gmail(dot)com> |
---|---|
To: | Lee Hachadoorian <lee(dot)hachadoorian(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Array columns vs normalized table |
Date: | 2010-03-02 17:57:00 |
Message-ID: | cc159a4a1003020957p41e77497rc08568bf17003912@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Tue, Mar 2, 2010 at 11:21 AM, Lee Hachadoorian
<lee(dot)hachadoorian(at)gmail(dot)com> wrote:
> I work with state labor data which is reported to us in the form
>
> industry, year, quarter1, quarter2, quarter3, quarter4
>
> where each quarter represents an employment count. Obviously, this can
> be normalized to
>
> industry, year, quarter, employment
>
> Can anyone comment on, or point to me to an article or discussion
> regarding, why one would use an array column instead of normalizing
> the data? That is, would there be any benefit to storing it as
>
> industry int, year smallint, employment int[ ]
>
> where the last column would be a four element array with data for the
> four quarters.
I think you might want to step back and ask yourself why you'd want to
normalize this data at all? Unless you are trying to operate on all 4
columns at once (comparisons or aggregation, etc.) I see no reason to.
It will take more space and give you no new capabilities. If you
need do need to operate on all 4 quarters simultaneously then you may
very well want to normalize, but in that case there's certainly no
advantage in going to an array type. Personally, the only reason I'd
see for ever using an array type is when you have many very closely
related values that would cause some huge number of rows if the data
is stored normalized
--
Peter Hunsberger
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Bailey | 2010-03-02 18:00:13 | Re: Array columns vs normalized table |
Previous Message | Pavel Stehule | 2010-03-02 17:54:47 | Re: Array columns vs normalized table |