Re: SELECT all the rows where id is children of other node.

From: Rob Sargent <robjsargent(at)gmail(dot)com>
To: Francisco Olarte <folarte(at)peoplecall(dot)com>
Cc: pabloa98 <pabloa98(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: SELECT all the rows where id is children of other node.
Date: 2019-08-21 17:54:14
Message-ID: 1A6490A9-87AB-41E0-AAE1-74403C6E2131@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> On Aug 21, 2019, at 3:35 AM, Francisco Olarte <folarte(at)peoplecall(dot)com> wrote:
>
> Pablo:
>
> On Tue, Aug 20, 2019 at 6:49 PM pabloa98 <pabloa98(at)gmail(dot)com> wrote:
>> Thank you for your responses Rob. Appreciated. The problem with recursive queries is that they are executed several times and it has and impact in performance.
>> I need a subset of those rows and I want them in one pass.
>> I discovered that ltree extension could be useful. I will play with it today. I am sure there's a way to find al the nodes in O(n) time with n = size of the resulset ...
>
> Unless you have some extra conditions in a table ( like
> "autoincremented immutable primary key and parents are always created
> before childs" ) I think your problem of "get all the descendant ( i
> do not like to call them children ) nodes of a given id" can not be
> solved in one pass.
>
> I mean, if you are getting descendants of the node N1, you need to
> read the last node, NL, of the table to know if it is a child of N1.
> But then you have to read the table again to find childs of NL.
>
> Of course, if you have something like "hierarchical ids" you can
> traverse ordering by it and know NL MUST be childless, and build the
> tree rooted on node N1 as you go, but without some of this conditions
> I do not think it can be done in an "ideal" db ( which lets you scan
> in any order you can define from just a row without cost ) in one scan
> ( storing and prunning the whole tree as you go is cheating ).
>
> Also, if your node ids come from a serial and are immutables, or you
> take a little care when mutating them, you can do it traversing by id,
> but you need a full scan, a recursive query with several index scans
> may easily be faster in wide trees.
>
>
> Francisco Olarte.
If you accept Francisco’s thesis then you may be interested in this
with recursive descendants (last, children) as
(select c.c, array[null::int] from kids c where not exists (select 1 from kids p where c.c = p.p)
union all
select k.p, array[k.c] || l.children from kids k, descendants l where k.c = l.last)
select a.last, array_agg(distinct(a.kids))as clan from (select last, unnest(array_remove(children, null)) as kids from descendants where children[1] is not null) as a group by last order by last
last | clan
------+--------------------------
1 | {2,3,4,21,22,23,221,222}
2 | {21,22,23,221,222}
22 | {221,222}
(3 rows)
No comment on performance other than to say that if you are interested in the result for a given seed parent then performance would likely correlate with the average depth of your lineages.

I believe the ascending order of the members of each clan is completely fortuitous unless it’s a consequence of distinct?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Michael Lewis 2019-08-21 18:08:27 Re: Complex filters -> Bad row estimates -> bad query plan
Previous Message Mathieu Fenniak 2019-08-21 15:53:22 Complex filters -> Bad row estimates -> bad query plan