Re: How to just get the last in a recursive query

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Steve Midgley <science(at)misuse(dot)org>
Cc: Shaozhong SHI <shishaozhong(at)gmail(dot)com>, Rob Sargent <robjsargent(at)gmail(dot)com>, pgsql-sql <pgsql-sql(at)lists(dot)postgresql(dot)org>
Subject: Re: How to just get the last in a recursive query
Date: 2022-04-05 00:50:09
Message-ID: CAKFQuwZR8bL7_CZDpmqqdX_T1fn-RvrEsw=egYdDVPL_BrUgSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Mon, Apr 4, 2022 at 5:00 PM Steve Midgley <science(at)misuse(dot)org> wrote:

> On Mon, Apr 4, 2022 at 4:32 PM David G. Johnston <
> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
>> On Mon, Apr 4, 2022, 16:21 Shaozhong SHI <shishaozhong(at)gmail(dot)com> wrote:
>>
>>> That is not the most efficient in this case.
>>
>>
>> Can you prove that statement? Provide a query that is more efficient.
>>
>
> Just to share the SQL from that example
>
> WITH RECURSIVE walk_network(id, segment) AS (
> SELECT id, segment
> FROM network
> WHERE id = 6
> UNION ALL
> SELECT n.id, n.segment
> FROM network n, walk_network w
> WHERE ST_DWithin(
> ST_EndPoint(w.segment),
> ST_StartPoint(n.segment),0.01))SELECT idFROM walk_network
>
>
> David J (kind of off-topic): There's no *order by *in the original query,
> so I could imagine that adding any order by clause at all would make the
> query less efficient. But maybe it could become more efficient if the
> planner picks a better index as a result?
>
> David (OP): My main point is that in this example, since no order by
> clause is provided, it is meaningless to talk about a "last" or "first"
> item. SQL, afaik, is not required to produce the results in any order
> whatsoever, when no order by clause is provided (corrections welcome if
> that's not accurate). So while you might grab the last item somehow this
> time, it might not be the last item, the next time you run the query. So
> I'd say you should add an appropriate order by query, and then you can
> measure "ASC" vs "DESC" with "LIMIT 1" to see if either one is less
> efficient. (I'm in David J's camp that it's unlikely to make any difference)
>
>
Reading the query now...

I get the desire of the OP - walk the graph and just output the final node
that you fall upon. The nature of a recursive CTE is that it can/does have
a natural order if the graph produced is linear - thus it has a well
defined last node. The CTE, however, captures the entire path. The main
query, which only cares about the final node, then has to reverse the path
and then select the first node. That is the solution that was provided.

There is no way to get the final node in a path, giving a starting node,
without walking the path. Or rather, if there is for a given set of data,
a recursive CTE is not the best way to get at it. I'll admin the presence
of PostGIS makes this a bit harder for me to personally reason about, but
the concept is valid and turning the CTE into a subquery, reversing the
output (which should have an index indicating node order, in addition to
the node id), and doing limit 1 gives the desired answer.

Is there a better way to get that answer for this particular query - I have
no clue - but absent a better answer the OP needs to just take the one
suggestion that does provide a valid answer and assume it is the best
possible. Since no other suggestion exists the one answer is by definition
also the best answer.

David J.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Tchouante, Merlin 2022-04-05 13:03:44 RE: How to just get the last in a recursive query
Previous Message Steve Midgley 2022-04-05 00:00:00 Re: How to just get the last in a recursive query