From: Andrew Gould <andrewlylegould(at)gmail(dot)com>
To: Ow Mun Heng <ow(dot)mun(dot)heng(at)wdc(dot)com>
Cc: Sam Mason <sam(at)samason(dot)me(dot)uk>, pgsql-general(at)postgresql(dot)org
Subject: Re: Putting many related fields as an array
Date: 2009-05-12 13:32:34
Message-ID: d356c5630905120632g53c92fa4wd12aeba1e58576d6@mail.gmail.com
Lists: pgsql-general
On Tue, May 12, 2009 at 7:06 AM, Ow Mun Heng <ow(dot)mun(dot)heng(at)wdc(dot)com> wrote:
> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-
> On Tue, May 12, 2009 at 01:23:14PM +0800, Ow Mun Heng wrote:
> >> | sum of count | sum_of_count_squared | qty | qty < 100 | qty < 500 |
> >>
> >>
> >> I'm thinking of lumping them into 1 column via an array instead of into
> >> 5 different columns. Not sure how to go about this, hence the email to
> >> the list.
>
> >The normal array constructor should work:
> >
> >  SELECT ARRAY[MIN(v),MAX(v),AVG(v),STDDEV(v)]
> > FROM (VALUES (1),(3),(4)) x(v);
> >
> >Not sure why this is better than using separate columns though. Maybe a
> >new datatype and a custom aggregate would be easier to work with?
>
> The issue here is the # of columns needed to populate the table.
>
> The table I'm summarizing has between 50 and 100+ columns; if the
> 1:5x ratio is used as a yardstick, the table will get awfully wide quickly.
>
> I need to know how to do it first, then test accordingly for performance
> and
> corner cases.
>
>
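For what it's worth, the five summary values from the original post can be packed into a single array column with the same ARRAY constructor. A rough sketch (the table and column names here are hypothetical, and CASE expressions stand in for the conditional counts):

```sql
SELECT ARRAY[
         SUM(count),
         SUM(count * count),
         COUNT(qty),
         SUM(CASE WHEN qty < 100 THEN 1 ELSE 0 END),
         SUM(CASE WHEN qty < 500 THEN 1 ELSE 0 END)
       ] AS summary
FROM raw_measurements;  -- hypothetical source table
```

Note that all elements of an array must share one type, so if count and qty have different types you may need explicit casts on the individual elements.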
I apologize for coming into this conversation late. I used to analyze a
public-use flat file that had one row per patient and up to 24
diagnosis codes, each in a different column. Is this analogous to your
situation? I found it was worth the effort to convert the flat file into a
relational data model where the patients' diagnosis codes were in one column
in a separate table. This model also makes more complex analysis easier.
Since there were several types of fields that needed to be combined into
their own tables, I found it took less time to convert the flat file to the
relational model using a script prior to importing the data into the
database server. A Python script would read the original file and create 5
clean, tab-delimited files that were ready to be imported.
I hope this helps.
Andrew