Re: Calculate Weighted Aggregation between Two Tables

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Calculate Weighted Aggregation between Two Tables
Date: 2007-09-19 11:02:07
Message-ID: 46F101AF.3090709@cox.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/19/07 04:13, Stefan Schwarzer wrote:
> Hi,
>
> I have global national statistics (GDP, Fish Catch etc.), which I would
> like to aggregate to and calculate per Capita data for given regions
> (Africa, Europe...) and subregions (Western Africa, ....) on-the-fly.
>
> From a statistical point of view it seems best to use something like this:
>
> given that the variable is "v" and the Total Population table is "tp":
>
> (country_1_of_region_1_for_v / country_1_of_region_1_for_tp)
> + (country_2_of_region_1_for_v / country_2_of_region_1_for_tp) + ....
>
> ----------------------------------------------------------------
>
> (sum_countries_all_of_region_1_for_v /
> sum_countries_all_of_region_1_for_tp)
>
>
> and then same thing for the other regions (in my case a total of 6
> regions).
>
>
> I have a table of countries with a field country_id and region_id (which
> specifies to which region this country belongs).
> I have the table for the variable with a field country_id, which has
> kind of a "Excel design", i.e. columns for each year.

Well that's a Very Bad Thing. (Wasn't there a thread about this
last week?)

Since the "years" are columns, are the "variables" in rows?

> And I have a table of regions with a field region_id.
>
> It seems like a rather complicated thing to do, as one needs first to
> get all the countries of a specific region for both tables (variable &
> pop_total), have the SUM(of all countries) available and run the
> mathematical procedure...
>
> Could someone give me a hint how this could be achieved?

This would be bog-simple if your tables were structured like this:

CREATE TABLE T_REGION (
REGION_ID CHAR(8) PRIMARY KEY,
DESCRIP TEXT);

CREATE TABLE T_COUNTRY (
COUNTRY_ID CHAR(8) PRIMARY KEY,
REGION_ID CHAR(8) FOREIGN KEY REFERENCES (T_REGION.REGION_ID),
DESCRIP TEXT);

CREATE TABLE T_NATIONAL_STATISTICS (
COUNTRY_ID CHAR(8),
STAT_YEAR SMALLINT CHECK VALUES BETWEEN(1900 AND 2020),
FORECAST BOOL,
GOVT_TYPE CHAR(4) FOREIGN KEY REFERENCES (T_GOVERNMENT.GOVT_TYPE),
POPULATION BIGINT,
GDP BIGINT,
FISH_CATCH BIGINT,
FISH_CONSUMED BIGINT,
WHEAT_HARVEST BIGINT,
WHEAT_CONSUMED BIGINT
CRUDE_PROD_BBL BIGINT, -- crude oil production,
CRUDE_CONSUM_BBL BIGINT, -- crude oil consumtion,
etc,
etc,
etc,
PRIMARY_KEY (COUNTRY_ID, STAT_YEAR));

(Syntax is a bit off, but you get the meaning...)
(I like human-readable keys.)

Whenever you find a new statistic you want to track, you add a new
column to T_NATIONAL_STATISTICS.

So, to (I think) get info on per-capita fish "usage" by year, by
country, for West Africa, you do:

SELECT NS.STAT_YEAR,
NS.COUNTRY_ID,
CAST(CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION
AS NUMERIC(8,2)) AS CATCH_PERCAP,
CAST(CAST(NS.FISH_CONSUMED AS DOUBLE)/NS.POPULATION
AS NUMERIC(8,2)) AS EATEN_PERCAP,
CAST(CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION) -
CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION)
AS NUMERIC(8,2)) AS FIST_SELF_SUFFICIENCY
FROM T_REGION R,
T_COUNTRY C,
T_NATIONAL_STATISTICS NS
WHERE R.REGION_ID = 'WESTAFRI'
AND C.REGION_ID = R.REGION_ID
AND NS.COUNTRY_ID = C.COUNTRY_ID;

You might want to left outer-join NS to C to indicate any west
African countries with missing statistics.

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG8QGvS9HxQb37XmcRAprKAKCiDphSaLe1nTIf8tfaIwNCx/zZngCgpfeo
QF+ojV0/jYlLbLP+BDV86p4=
=7o+G
-----END PGP SIGNATURE-----

In response to

Browse pgsql-general by date

  From Date Subject
Next Message A. Kretschmer 2007-09-19 11:15:06 Re: how to know the current size of a database
Previous Message son 2007-09-19 10:36:02 how to know the current size of a database