Re: Concatenate of values in hierarchical data

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: "Mr(dot) Baseball 34" <mrbaseball34(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Concatenate of values in hierarchical data
Date: 2018-04-04 09:43:11
Message-ID: 4945D518-FC1C-49AF-B773-7F4F7FC18E3E@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 2 Apr 2018, at 19:23, Mr. Baseball 34 <mrbaseball34(at)gmail(dot)com> wrote:
>
> I have the data below, returned from a PostgreSQL table using this SQL:
>
> SELECT ila.treelevel,
> ila.app,
> ila.lrflag,
> ila.ic,
> ila.price,
> ila.treelevel-1 as parent,
> ila.seq
> FROM indexlistapp ila
> WHERE ila.indexlistid IN (SELECT il.indexlistid
> FROM indexlist il
> WHERE il.model = '$model'
> AND ('$year' BETWEEN il.beginyear AND il.endyear)
> AND il.type = '$part')
> ORDER BY ila.seq;
>
> Data Returned
>
> level app lrflag ic price parent seq indexlistid
> ---------------------------------------------------------------------------------
> 1, 'Front', null, null, null, 0, 27, 439755
> 2, 'V-Series' null, null, null, 1, 28, 439755
> 3, 'opt J56', null, null, null, 2, 29, 439755
> 4, 'R.', 'R', '536-01132AR','693.00', 3, 30, 439755
> 4, 'L.', 'L', '536-01133AL','693.00', 3, 31, 439755
> 3, 'opt J63', null, null, null, 2, 32, 439755
> 4, 'R.', 'R', '536-01130R', null, 3, 33, 439755
> 4, 'L.', 'L', '536-01131L', null, 3, 34, 439755
> 2, 'exc. V-Series', null, null, null, 1, 35, 439755
> 3, 'opt JE5', null, null, null, 2, 36, 439755
> 4, 'AWD', null, null, null, 3, 37, 439755
> 5, 'R.', null, '536-01142', null, 4, 38, 439755
> 5, 'L.', null, '536-01143', null, 4, 39, 439755
> 4, 'RWD', null, null, null, 3, 40, 439755
> 5, 'R.', null, '536-01143', null, 4, 41, 439755
> 5, 'L.', null, '536-01142', null, 4, 42, 439755
> 3, 'opt J55', null, null, null, 2, 43, 439755
> 4, 'AWD', null, null, null, 3, 44, 439755
> 5, 'R.', null, '536-01036', null, 4, 45, 439755
> 5, 'L.', null, '536-01037', null, 4, 46, 439755
> 4, 'RWD', null, null, null, 3, 47, 439755
> 5, 'R.', null, '536-01037', null, 4, 48, 439755
> 5, 'L.', null, '536-01036', null, 4, 49, 439755
> 1, 'Rear', null, null, null, 0, 260, 439765
> 2, 'Base', null, null, null, 1, 261, 439765
> 3, 'opt JE5', null, null, null, 2, 262, 439765
> 4, 'R.', 'R', '536-01038R', null, 3, 263, 439765
> 4, 'L.', 'L', '536-01039L', null, 3, 264, 439765
> 3, 'opt J55', null, null, null, 2, 265, 439765
> 4, 'R.', 'R', '536-01042R', null, 3, 266, 439765
> 4, 'L.', 'L', '536-01043L', null, 3, 267, 439765
> 2, 'V-Series', null, null, null, 1, 268, 439765
> 3, 'R.', 'R', '536-01134AR', '403.00', 2, 269, 439765
> 3, 'L.', 'L', '536-01135AL', '466.00', 2, 270, 439765
>
> matching data from indexlist
>
> model type beginyear endyear indexlistid
> ---------------------------------------------
> 'CTS', '536', 2009, 2010, 439755
> 'CTS', '536', 2009, 2010, 439765
>
> There are primary keys on indexlist (on indexlistid) and indexlistapp (on indexlistid) but there is no foreign key pointing to the other table. The indexlistid in indexlist
> points directly to the indexlistid in indexlistapp. The parent column is simply calculated from the treelevel. The tree is built entirely from the seq and treelevel.
>
> I need the data to be returned in this format:
>
> app price ic
> ---------------------------------------------------------------
> 'Front-V-Series-opt J56-R.', '$693', '536-01132AR'
> 'Front-V-Series-opt J56-L.', '$693', '536-01132AL'
> 'Front-V-Series-opt J63-R.', null, '536-01130R'
> 'Front-V-Series-opt J63-L.', null, '536-01131L'
> 'Front-exc. V-Series-opt JE5-AWD-R.', null, '536-01142'
> 'Front-exc. V-Series-opt JE5-AWD-L.', null, '536-01143'
> 'Front-exc. V-Series-opt JE5-RWD-R.', null, '536-01143'
> 'Front-exc. V-Series-opt JE5-RWD-L.', null, '536-01142'
> 'Front-exc. V-Series-opt J55-AWD-R.', null, '536-01136'
> 'Front-exc. V-Series-opt J55-AWD-L.', null, '536-01137'
> 'Front-exc. V-Series-opt J55-RWD-R.', null, '536-01137'
> 'Front-exc. V-Series-opt J55-RWD-L.', null, '536-01136'
> 'Rear-Base-opt JE5-R.', null, '536-01038R'
> 'Rear-Base-opt JE5-L.', null, '536-01039L'
> 'Rear-Base-opt J55-R.', null, '536-01042R'
> 'Rear-Base-opt J55-L.', null, '536-01043L'
> 'Rear-V-Series-R.', '$403.00', '536-01134AR'
> 'Rear-V-Series-L.', '$466.00', '536-01135AL'
>
> I am unsure how to do this in SQL.

(…)

> Also, using this recursive sql (REF:https://stackoverflow.com/questions/26280379/how-to-concatenate-field-values-with-recursive-query-in-postgresql) I'm able to get it fairly close.
> Just not sure why it is returning 476 rows vs. 34. http://sqlfiddle.com/#!15/ca1ee/3
>
> WITH RECURSIVE the_tree AS (
>
> SELECT g.seq, g.app::TEXT, g.price, g.ic, g.treelevel::INTEGER
> FROM indexlistapp g
> WHERE g.indexlistid in (SELECT il.indexlistid
> FROM indexlist il
> WHERE il.model = 'CTS'
> AND ('2010' BETWEEN il.beginyear AND il.endyear)
> AND il.type = '536')
> AND g.treelevel = 1
>
>
> UNION
>
> SELECT t.seq, t.app || ', ' || g.app AS app, t.price, t.ic, t.treelevel::INTEGER + 1
> FROM the_tree AS t
> INNER JOIN indexlistapp g ON g.treelevel = t.treelevel + 1
> WHERE g.indexlistid in (SELECT il.indexlistid
> FROM indexlist il
> WHERE il.model = 'CTS'
> AND ('2010' BETWEEN il.beginyear AND il.endyear)
> AND il.type = '536')
> )
> SELECT * from the_tree;
>
>
> Can anyone help me out here?

Your JOIN in the recursive part of the CTE should probably be ON g.parent = t.treelevel AND g.indexlistid = t.indexlistid. Since both the initial and recursive part use the same value for indexlistid, the subquery in the recursive part is superfluous.

You should also take the order in seq into account for items on the same node, if that is possible in your data, otherwise there is no guarantee that you'll get your string components in the correct order.

I think you need something closer to this:

WITH RECURSIVE the_tree AS (
SELECT g.seq, g.app::TEXT, g.price, g.ic, g.treelevel::INTEGER, g.indexlistid
FROM indexlistapp g
WHERE g.indexlistid IN (
SELECT il.indexlistid
FROM indexlist il
WHERE il.model = 'CTS'
AND '2010' BETWEEN il.beginyear AND il.endyear
AND il.type = '536'
)
AND g.parent = 0

UNION ALL

SELECT t.seq, t.app || ', ' || g.app AS app, t.price, t.ic, g.treelevel, g.indexlistid
FROM the_tree t
INNER JOIN indexlistapp g ON g.parent = t.treelevel AND g.indexlistid = t.indexlistid
ORDER BY g.seq
)
SELECT * FROM the_tree

Another observation is that you only seem interested in the leaf nodes of your tree. Recognising leaf nodes in your hierarchical query can be achieved by adding a column that tests the existence of a node with the current node as its parent - if it does not exist, this node is a leaf node.

The result of SELECT * FROM the_tree can then be filtered to return leaf nodes only.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jerry Sievers 2018-04-04 22:29:46 PgUpgrade bumped my XIDs by ~50M?
Previous Message Simon Riggs 2018-04-04 07:03:16 Re: [PATCH] Logical decoding of TRUNCATE