Re: SQL Property Graph Queries (SQL/PGQ)

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SQL Property Graph Queries (SQL/PGQ)
Date: 2024-08-05 12:41:20
Message-ID: CAExHW5tVY0fNpJ+x9TMD__ZJvqZ1V2t__y9SbR=Bhu8AoE8U_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 22, 2024 at 5:31 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>

I found that the patches do not support cyclic paths correctly. A
cyclic path pattern is a path patterns where an element pattern
variable repeats e.g. (a)->(b)->(a). In such a path pattern the
element patterns with the same variable indicate the same element in
the path. In the given example (a) specifies that the path should
start and end with the same vertex. Patch 0006 supports cyclic path
patterns.

Elements which share the variable name should have the same element
type. The element patterns sharing the same variable name should have
same label expression. They may be constrained by different conditions
which are finally ANDed since they all represent the same element. The
patch creates a separate abstraction "path_factor" which combines all
the GraphElementPatterns into one element pattern. SQL/PGQ standard
uses path_factor for such an entity, so I chose that as the structure
name. But suggestions are welcome.

A path_factor is further expanded into a list of path_element objects
each representing a vertex or edge table that satisfies the label
expression in GraphElementPattern. In the previous patch set, the
consecutive elements were considered to be connected to each other.
Cyclic paths change that. For example, in path pattern (a)->(b)->(a),
(b) is connected to the first element on both sides (forming a cycle)
instead of first and third element. Patch 0006 has code changes to
appropriately link the elements. As a side effect, I have eliminated
the confusion between variables with name gep and gpe.

While it's easy to imagine a repeated vertex pattern, a repeated edge
pattern is slightly complex. An edge connects only two vertices, and
thus a repeated edge pattern constrains the adjacent vertex patterns
even if they have different variable names. Such patterns are not
supported. E.g. (a)-[b]->(c)-[b]->(d) would mean that (d) and (a)
represent the same vertex even if the variable names are different.
Such patterns are not supported for now. But (a)-[b]->(a)-[b]->(a) OR
(a)-[b]->(c)<-[b]-(a) are supported since the vertices adjacent to
repeated edges are constrained by the variable name anyway.

The patch also changes many foreach() to use foreach_* macros as appropriate.

> 0001 - same as previous one
> 0002 - fixes pgperltidy complaints
> 0003 - fixes compilation failure
> 0004 - fixes issue seen on CI
> 0005 - adds support for WHERE clause in graph pattern missing in the
> first patch.
0006 - adds full support for cyclic path patterns

Once reviewed, patches 0002 to 0006 should be merged into 0001.

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
0004-Fix-spurious-column-not-found-error-20240805.patch text/x-patch 1.0 KB
0003-Fix-compilation-error-20240805.patch text/x-patch 950 bytes
0005-support-WHERE-clause-in-graph-pattern-20240805.patch text/x-patch 7.2 KB
0002-pgperltidy-fixes-20240805.patch text/x-patch 1.2 KB
0001-WIP-SQL-Property-Graph-Queries-SQL-PGQ-20240805.patch text/x-patch 502.8 KB
0006-Support-cyclic-path-pattern-20240805.patch text/x-patch 37.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-08-05 12:43:26 BlastRADIUS mitigation
Previous Message Joe Conway 2024-08-05 12:38:00 Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?