Re: recursive WITH nested union ALL with NOCYCLE logic

From: Michael Moore <michaeljmoore(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: postgres list <pgsql-sql(at)postgresql(dot)org>
Subject: Re: recursive WITH nested union ALL with NOCYCLE logic
Date: 2016-03-21 18:38:47
Message-ID: CACpWLjPpswBj1Bw6QKm8wQgzVuJRUfi7uqKnMoB4jHTJ4u2E6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, Mar 18, 2016 at 6:03 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Fri, Mar 18, 2016 at 4:11 PM, Michael Moore <michaeljmoore(at)gmail(dot)com>
> wrote:
>
>> David,
>> If I am understanding you correctly, you are assuming that alternative
>> hierarchies are mapped to only ROOT level hierarchies. It's a reasonable
>> assumption on your part given my illustration only covered this use case.
>> But lets add another mapping.
>>
>> insert into mike_map (key,child,parent) values
>> ('555','kkk','bbb');
>> This creates an alternative hierarchy that branches from a CHILD (bbb) of
>> the ROOT (aaa) hierarchy.
>> So I think I still need the UNION ALL inside the recursive part.
>>
>>
> ​You are correct about my assumed premise - and I'm not sure I want to
> deal with the brain damage that would result from exploring the model you
> are working with...
>
> ​If you can change how you model "mike_map" this should be workable.
>
> The following provides the desired result for the original data as well
> as, I think, the correct result when adding you kkk/bbb alias.
>
> I think this gets you closer..but probably not all of the way to your goal.
>
> My specific concern is that none of these queries (including your
> original) have considered "ddd,cow,null" as a valid result...because it
> lacks a parent.
>
> I'll leave to you to copy-paste and make it readable...
>
> ​WITH recursive
> mike_map_alt (parent, children) AS (
> --using an array here is probably not mandatory but makes the logic below
> simpler by use of unnest()
> VALUES ('aaa',ARRAY['ddd','eee']::text[]), ('bbb',ARRAY['kkk']::text[])
> ), mike_hier_aliased AS (
> --here we "canonical-ize" the data so that for each row we know which
> direct aliases are in play
> SELECT mh.key::text AS key, mh.val, mh.parent::text AS parent,
> mh.key::text || COALESCE(mp.children, ARRAY[]::text[]) AS key_aliases,
> mh.parent::text || COALESCE(mk.children, ARRAY[]::text[]) AS parent_aliases
> FROM mike_hier mh
> LEFT JOIN mike_map_alt mp ON (mh.key = mp.parent)
> LEFT JOIN mike_map_alt mk ON (mh.parent = mk.parent)
> )
> --SELECT * FROM mike_hier_aliased --used for debugging the transformed
> data
>
> , recursive_part (my_key, my_val, my_parent) AS (
>
> -- unnest the for the original row output the value; we start at the root
> so parent should be null for all records
> -- for aliases we don't know what their root entry is yet so simply record
> null for value
> -- and let the first iteration pick up the real record
>
> -- this could be altered to left join the mike_heir_aliased table records
> containing null parents
> -- and coalesce the joined "val" column. This would cause "cow" from the
> "ddd' record to be added to the
> -- initial "ddd" unnested alias record.
> SELECT me_key, CASE WHEN key = me_key THEN val ELSE null::text END AS
> my_val, null::text AS my_parent FROM (
> SELECT *, unnest(key_aliases) AS me_key FROM mike_hier_aliased WHERE
> mike_hier_aliased.key = 'aaa'
> ) src
>
> -- on each iteration we again have to look for aliases and create
> -- dummy records for them (via unnest) for the next iteration
> UNION --let the recursive part take care of duplicates
> SELECT me_key,
> CASE WHEN key = me_key THEN val ELSE null::text END AS my_val,
> CASE WHEN key = me_key THEN parent ELSE null::text END AS my_parent
> FROM (
> SELECT mha.key, mha.val, mha.parent, unnest(mha.key_aliases) AS me_key
> FROM mike_hier_aliased mha
> JOIN recursive_part rp
> ON (rp.my_key = mha.parent) --fails to match mha.(ddd,cow,null) though
> rp.(ddd,null,null) is present
> ) rec_src
> )
>
> SELECT * FROM recursive_part WHERE my_val IS NOT NULL
> ​--our dummy records all have null for my_val so lets remove them
> ​
> David J.
>
> Here is the formatted version of the query you wrote:
with
recursive
mike_map_alt (parent, children) AS (
VALUES
('aaa', ARRAY['ddd','eee']::text[]),
('bbb', ARRAY['kkk']::text[])
),
mike_hier_aliased AS (
SELECT mh.key::text AS key,
mh.val,
mh.parent::text AS parent,
mh.key::text || COALESCE(mp.children, ARRAY[]::text[]) AS
key_aliases,
mh.parent::text || COALESCE(mk.children, ARRAY[]::text[]) AS
parent_aliases
FROM mike_hier mh
LEFT JOIN mike_map_alt mp ON (mh.key = mp.parent)
LEFT JOIN mike_map_alt mk ON (mh.parent = mk.parent)
)
SELECT * FROM mike_hier_aliased;

and you said:
*"My specific concern is that none of these queries (including your
original) have considered "ddd,cow,null" as a valid result...because it
lacks a parent."*
I think this concern is the result of me not explaining the use case for
"alternative hierarchies" sufficiently (or at all). The word "alternative"
is misleading. "extension" would have been more appropriate. In other
words, we want to *extend* the target hierarchy (the hierarchy which is the
subject of the query) with the children of the alternative hierarchy. In
my example, we could say; hierarchy aaa is to be extended with the children
of hierarchies ddd and eee. Hence "ddd,cow,null" in not interesting in the
context of a search on aaa.

Another thing I didn't mention, because I did not anticipate it's
importance, is the number of rows on the REAL tables I and dealing with.
The hierarchy table has 200,000 rows and the mapping table 4,000 rows.
The average query will return about 20 rows. So, for this reason it's not
going to be practical to "pre-join" all of the rows; not that you were
advocating this, just thought I should get it out there.

I've learned a lot from looking at what you have provided so far.
In theory there should be no recursive loops in the data. I just thought
I'd put the NO-CYCLE logic in just to be safe. The point is, I can go with
my existing solution so this isn't urgent, it's more of a puzzle; a "is it
possible to do this" sort of thing. *Thanks again for your time and help*.
Mike

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Michael Moore 2016-03-23 00:23:50 simple function index question
Previous Message Adrian Klaver 2016-03-21 16:04:31 Re: plan not correct?