Re: Database structure

From: Svenne Krap <svenne(at)krap(dot)dk>
To: Alain Reymond <alain(dot)reymond(at)ceia(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Database structure
Date: 2004-05-04 16:10:13
Message-ID: 4097C065.7050004@krap.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I would definately say solution two.

As you point out yourself, there are only for int4s (propably even
int2s), that is 8 bytes each for the int4 (if I remeber corretly), which
equals something in the 40-50 bytes range for the row w/o index.
For 15m rows, thats not much more than 750 megabytes without the
indexes, which I believe take up roughly the same amount of space.

That might be around 1.5 GB of data, which I still consider a reasonably
sized database.
What I work on daily is in the 1 GB range already and grows 25-30
megabytes/day and we use large (3000 bytes or longer), complex (with
more than 10 subselects and utilizing inner joins, outer joins, cross
joins) queries, who are returning around 3000 rows each. This runs in a
matter of 2 minutes on a single cpu 2ghz system with ide disk (no raid)
and only half a gig of memory.

A good starting point for help would be data sizes of each tables (in my
opinion the number of digits is usually close enough), complexity of
querys (how many tables, subselects, types of joins,uses of aggregates
and so on) and finally what is considered "fast enough" - for a website
2 seconds may be fast enough, for a croned job once a month, the same
might be true for 3 days.

In the long run, being correct is usually better than being fast (at the
point of the implementation), as new hardware easily solves bottlenecks
for problems not scaling exponentially.

Svenne

Alain Reymond wrote:

>Hello,
>
>I would like an advise on the following problem :
>
>I have a table of patients.
>Each patient can make different biological assessments.
>Each assessment is always decomposed into different laboratory tests.
>A laboratory test is made of a test number and two values coming from analysers.
>
>The schema is :
>Patients(#patient_nr,name,etc...)
>Assessment(#assessment_nr, #patient_nr, assessment_type, other usefull values).
>Assessment_types(assessment_type, labtest_nr)
>An assessment is composed of different tests, let's say assessment type 1 is
>composed of lab test 1,2,3,5 and assessment type 2 of lab test number 10 to 70.
>
>I have an assessment with 60 different lab tests (always the same). I have two ways
>for storing the values :
>
>1 - a table with 120 columns for the two values.
>results(#assessment_nr, p10,d10, p11,d11, .....,p70,d70).
>where 10 to 70 represents the lab test number.
>
>2 - a table with 60 rows for one assessment :
>results(#assessment_nr, labtest_nr, p, d) where p and d are my two results.
>
>Here comes my question. Which of the two would you choose?
>
>The firsrt solution has the advantage of returning one single row for one complete
>assessment. If I have to make statistics, it is easy. But, if I have to modify the
>composition of an assessment (which occurs very rarely), I shall have to use an alter
>table instruction. As I have 4 different assessment types, I have to create five
>different tables, one per assessment type.
>
>The second solution is normalized and more elegant. But I am preoccupied by the
>size of the table. For one assessment, I'll store 60 rows with only two useful integers
>in it. And you must add the size of the index. With 25.000 assessments a year, it
>makes 1.500.000 rows with only 4 columns amoung them 2 only for the results and 2
>for identification. I would like to store 10 years online, so 15.000.000 rows. What
>about the size of index ?
>
>Any advise ? I thank you in advance.
>
>
>Alain Reymond
>
>(I hope that it is clear enough with my bad English).
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 7: don't forget to increase your free space map settings
>
>
>

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Bruno Wolff III 2004-05-04 16:25:39 Re: Database structure
Previous Message Alain Reymond 2004-05-04 15:45:44 Re: Database structure