From: | Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Calculate Weighted Aggregation between Two Tables |
Date: | 2007-09-19 11:02:07 |
Message-ID: | 46F101AF.3090709@cox.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 09/19/07 04:13, Stefan Schwarzer wrote:
> Hi,
>
> I have global national statistics (GDP, Fish Catch etc.), which I would
> like to aggregate to and calculate per Capita data for given regions
> (Africa, Europe...) and subregions (Western Africa, ....) on-the-fly.
>
> From a statistical point of view it seems best to use something like this:
>
> given that the variable is "v" and the Total Population table is "tp":
>
> (country_1_of_region_1_for_v / country_1_of_region_1_for_tp)
> + (country_2_of_region_1_for_v / country_2_of_region_1_for_tp) + ....
>
> ----------------------------------------------------------------
>
> (sum_countries_all_of_region_1_for_v /
> sum_countries_all_of_region_1_for_tp)
>
>
> and then same thing for the other regions (in my case a total of 6
> regions).
>
>
> I have a table of countries with a field country_id and region_id (which
> specifies to which region this country belongs).
> I have the table for the variable with a field country_id, which has
> kind of a "Excel design", i.e. columns for each year.
Well that's a Very Bad Thing. (Wasn't there a thread about this
last week?)
Since the "years" are columns, are the "variables" in rows?
> And I have a table of regions with a field region_id.
>
> It seems like a rather complicated thing to do, as one needs first to
> get all the countries of a specific region for both tables (variable &
> pop_total), have the SUM(of all countries) available and run the
> mathematical procedure...
>
> Could someone give me a hint how this could be achieved?
This would be bog-simple if your tables were structured like this:
CREATE TABLE T_REGION (
REGION_ID CHAR(8) PRIMARY KEY,
DESCRIP TEXT);
CREATE TABLE T_COUNTRY (
COUNTRY_ID CHAR(8) PRIMARY KEY,
REGION_ID CHAR(8) FOREIGN KEY REFERENCES (T_REGION.REGION_ID),
DESCRIP TEXT);
CREATE TABLE T_NATIONAL_STATISTICS (
COUNTRY_ID CHAR(8),
STAT_YEAR SMALLINT CHECK VALUES BETWEEN(1900 AND 2020),
FORECAST BOOL,
GOVT_TYPE CHAR(4) FOREIGN KEY REFERENCES (T_GOVERNMENT.GOVT_TYPE),
POPULATION BIGINT,
GDP BIGINT,
FISH_CATCH BIGINT,
FISH_CONSUMED BIGINT,
WHEAT_HARVEST BIGINT,
WHEAT_CONSUMED BIGINT
CRUDE_PROD_BBL BIGINT, -- crude oil production,
CRUDE_CONSUM_BBL BIGINT, -- crude oil consumtion,
etc,
etc,
etc,
PRIMARY_KEY (COUNTRY_ID, STAT_YEAR));
(Syntax is a bit off, but you get the meaning...)
(I like human-readable keys.)
Whenever you find a new statistic you want to track, you add a new
column to T_NATIONAL_STATISTICS.
So, to (I think) get info on per-capita fish "usage" by year, by
country, for West Africa, you do:
SELECT NS.STAT_YEAR,
NS.COUNTRY_ID,
CAST(CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION
AS NUMERIC(8,2)) AS CATCH_PERCAP,
CAST(CAST(NS.FISH_CONSUMED AS DOUBLE)/NS.POPULATION
AS NUMERIC(8,2)) AS EATEN_PERCAP,
CAST(CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION) -
CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION)
AS NUMERIC(8,2)) AS FIST_SELF_SUFFICIENCY
FROM T_REGION R,
T_COUNTRY C,
T_NATIONAL_STATISTICS NS
WHERE R.REGION_ID = 'WESTAFRI'
AND C.REGION_ID = R.REGION_ID
AND NS.COUNTRY_ID = C.COUNTRY_ID;
You might want to left outer-join NS to C to indicate any west
African countries with missing statistics.
- --
Ron Johnson, Jr.
Jefferson LA USA
Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFG8QGvS9HxQb37XmcRAprKAKCiDphSaLe1nTIf8tfaIwNCx/zZngCgpfeo
QF+ojV0/jYlLbLP+BDV86p4=
=7o+G
-----END PGP SIGNATURE-----
From | Date | Subject | |
---|---|---|---|
Next Message | A. Kretschmer | 2007-09-19 11:15:06 | Re: how to know the current size of a database |
Previous Message | son | 2007-09-19 10:36:02 | how to know the current size of a database |