Re: Advice for using integer arrays?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Michael Heaney <mheaney(at)jcvi(dot)org>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Advice for using integer arrays?
Date: 2015-01-06 19:19:56
Message-ID: CAMkU=1xRNXZOOEB0yx1_eTsTc5RDEN+Gv2zW0c=-w02Xx1V+xQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Jan 6, 2015 at 9:09 AM, Michael Heaney <mheaney(at)jcvi(dot)org> wrote:

> I'm fairly new to Postgres, and have a design issue for which an array of
> integers might be a good solution. But I'd like to hear from the experts
> before proceeding down this path.
>
> Essentially, I'm trying to model the relationship between a group of
> biological samples and their genes. Each sample (there are ~10K of them
> now, with more coming) will have about 30,000 genes. Conversely, a
> particular gene may be present in almost all samples.
>
> So I've created the following table to handle the many-to-many
> relationship:
>
> create table sample_gene (id serial, sample_id int, gene_id int);
>

What is the value of having this table at all? It doesn't seem to contain
anything informative, like an allele identifier, a resequence, or a copy
number variation. If you are just trying to record the fact that a gene
was present in that sample, perhaps it would be better to instead record
the genes have been deleted, rather than the ones that have not been
deleted? That would probably be a much smaller list.

> create table sample_gene_array (id serial, sample_id int, gene_id int []
> );
>
> So now the table data looks like this:
>
> sample_id | gene_id []
> ---------------------------------------
> 1 | [1:30475]
> 2 | [1:29973]
> etc.
>

I'm not familiar with the square bracket and colon as a syntax for
expressing int arrays. Are you taking liberties with the psql output, or
using a different client program? Does that represent the range from 1 to
30475, or the two values 1 and 30475?

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andrey Lizenko 2015-01-06 19:35:27 How to monitor locks (max_pred_locks_per_transaction)?
Previous Message Andy Colson 2015-01-06 18:35:45 Re: Correct/optimal DML query for application session management ?