Re: Summing & Grouping in a Hierarchical Structure

From: Bryan L Nuse <nuse(at)uga(dot)edu>
To: Don Parris <parrisdc(at)gmail(dot)com>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Summing & Grouping in a Hierarchical Structure
Date: 2013-02-23 04:54:05
Message-ID: 8A586FDE-CC0A-4F26-AD19-1F89C2D9349E@uga.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)

So if I want to see:
TOP.Groceries | 240.00
TOP.Transportation | 180.00

Hello Don,

Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with "Transportation.Bicycle.Gear", you could not also have a category "Transportation.Bicycle.Gear.Chain_ring". (To add the latter category, you'd have to put costs from the former under something like "Transportation.Bicycle.Gear.General" -- or modify the approach.) However, lengthening the "Alcohol" branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values.

If I'm off base, by all means specify just how.

Regards,
Bryan

--------------------------------------------------

CREATE TABLE level1 (
cat text PRIMARY KEY
);

CREATE TABLE level2 (
cat text PRIMARY KEY,
parent text REFERENCES level1(cat)
);

CREATE TABLE level3 (
cat text,
parent text REFERENCES level2(cat),
cost numeric(6,2)
);

CREATE TABLE level4 (
cat text,
parent text,
cost numeric(6,2)
);

INSERT INTO level1
VALUES ('Transportation'),
('Groceries');

INSERT INTO level2
VALUES ('Auto', 'Transportation'),
('Bicycle', 'Transportation'),
('Fares', 'Transportation'),
('Food', 'Groceries'),
('Beverages', 'Groceries');

INSERT INTO level3
VALUES ('Fuel', 'Auto', 50.00),
('Maintenance', 'Auto', 30.00),
('Fuel', 'Auto', 25.00),
('Gear', 'Bicycle', 40.00),
('Gear', 'Bicycle', 20.00),
('Bus', 'Fares', 10.00),
('Train', 'Fares', 5.00),
('Beverages', 'Food', 30.00),
('Fruit_Veg', 'Food', 40.00),
('Meat_Fish', 'Food', 80.00),
('Grains_Cereals', 'Food', 30.00),
('Alcohol', 'Beverages', NULL),
('Juice', 'Beverages', 45.00),
('Other', 'Beverages', 15.00);

INSERT INTO level4
VALUES ('Beer', 'Alcohol', 25.00),
('Spirits', 'Alcohol', 10.00),
('Wine', 'Alcohol', 50.00);

CREATE VIEW all_cats AS (
SELECT a.cat AS level4,
b.cat AS level3,
c.cat AS level2,
d.cat AS level1,
CASE WHEN a.cost IS NULL THEN 0
WHEN a.cost IS NOT NULL THEN a.cost
END
+ CASE WHEN b.cost IS NULL THEN 0
WHEN b.cost IS NOT NULL THEN b.cost
END AS cost
FROM level4 a
FULL JOIN
level3 b
ON (a.parent = b.cat)
FULL JOIN
level2 c
ON (b.parent = c.cat)
FULL JOIN
level1 d
ON (c.parent = d.cat)
ORDER BY level1, level2, level3, level4
);

SELECT * FROM all_cats;

level4 | level3 | level2 | level1 | cost
---------+----------------+-----------+----------------+-------
Beer | Alcohol | Beverages | Groceries | 25.00
Spirits | Alcohol | Beverages | Groceries | 10.00
Wine | Alcohol | Beverages | Groceries | 50.00
| Juice | Beverages | Groceries | 45.00
| Other | Beverages | Groceries | 15.00
| Beverages | Food | Groceries | 30.00
| Fruit_Veg | Food | Groceries | 40.00
| Grains_Cereals | Food | Groceries | 30.00
| Meat_Fish | Food | Groceries | 80.00
| Fuel | Auto | Transportation | 50.00
| Fuel | Auto | Transportation | 25.00
| Maintenance | Auto | Transportation | 30.00
| Gear | Bicycle | Transportation | 20.00
| Gear | Bicycle | Transportation | 40.00
| Bus | Fares | Transportation | 10.00
| Train | Fares | Transportation | 5.00
(16 rows)

SELECT level1,
count(cost) AS num_branches,
sum(cost) AS total_cost
FROM all_cats
GROUP BY level1
ORDER BY level1;

level1 | num_branches | total_cost
----------------+--------------+------------
Groceries | 9 | 325.00
Transportation | 7 | 180.00
(2 rows)

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Ian Lawrence Barwick 2013-02-23 05:53:15 Re: Update HSTORE record and then delete if it is now empty - What is the correct sql?
Previous Message Don Parris 2013-02-23 02:52:18 Re: Summing & Grouping in a Hierarchical Structure