Re: Database/Table Design for Global Country Statistics

From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database/Table Design for Global Country Statistics
Date: 2007-09-12 09:16:12
Message-ID: 46E7AE5C.3030106@wildenhain.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Stefan Schwarzer schrieb:
> Hi there,
>
> I learned in another posting that my table design - in a polite way -
> "could be improved".
>
> So, before doing any additional design errors, I would like to get
> feedback, if possible.
>
> I am dealing with some 500 tables for worldwide national statistics
> (GDP, population, environment etc.), covering approx. 30 years each. For
> each of these variables, I usually have as well (pre-prepared)
> subregional and regional aggregations too. These could - and should - at
> the end be calculated on-the-fly, and not pre-calculated and imported
> from Excel as it is for the moment.
>
> My (national) table for a given variable is in the moment as follows (id
> being the identifier for a specific country):
>
> id | 1970 | 1971 | ... | 2004 | 2005
> -------------------------------------------------------------------
> 1 | NULL | 36 | ... | 42 | 45
> 2 ......
>
> The new design would be like this:
>
> id | year | value
> -------------------------------
> 1 | 1970 | NULL
> 1 | 1971 | 36
> 1 ....
> 1 | 2005 | 45
> 2 | 1970 | ....
> 2 .....
>
>
> Would that be considered as "good table design" then?

I'd not save null values but otherwise you are fine.
If you want to select over stable ranges of years
w/o bothering with non existent data (and therefore year)
it might be usefull to have a separate mapping table like
year -> year_id (sequential) and use the Ids for year.

This can even be simplified to a table just providing you the
coverage of years you have as a whole and use the number
directly as foreign key in your table.

table:years

year
1970
1971
1972
...

table:measures
measure_id,description
1 foovar
2 barvar
...

table:values
year (fk from years), measure_id (fk from measures), value
1971 1 36

every column of "values" table would be not null and
primary key (year,measure_id)

This way you can easily create any report you want
by selectiong a year range, then left join the values table
as often as you want for a given measure and so on.

Regards
Tino

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Albe Laurenz 2007-09-12 09:19:54 Re: Question to schema public
Previous Message Richard Huxton 2007-09-12 08:53:47 Re: Database/Table Design for Global Country Statistics