Re: How hard would a "path" operator be to implement in PostgreSQL

From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: Chris Travers <chris(dot)travers(at)gmail(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: How hard would a "path" operator be to implement in PostgreSQL
Date: 2012-08-20 03:50:51
Message-ID: 5031B41B.9010703@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 08/20/2012 11:13 AM, David Johnston wrote:
> On Aug 19, 2012, at 21:28, Chris Travers <chris(dot)travers(at)gmail(dot)com> wrote:
>
>> Hi;
>>
>> I have been reading up on object-relational features of Oracle and DB2 and found that one of the big things they have that we don't is a path operator. The idea is that you can use the path operator to follow some subset of foreign keys called refs.
> Why do you feel this is a "big thing". Sure, you can save a few keystrokes when writing multi-table queries but that doesn't seem all that great and now you are introducing ambiguity into the planner/query when trying to resolve these implicit joins. I concur that introducing an explicit REF is a poor choice taken at face value since now you have to remember what references are present. With FOREIGN KEYS you introduce a logical constraint but you are able to perform an join between two tables independent of the presence of an FK.

These things mostly become valuable when used with more complex
expressions. They can also hide a *lot* of expensive work being done
behind the scenes, though, making seemingly simple queries extremely
expensive.

Think:

SELECT
parent->child11->child12 AS a,
parent->child21->child22 AS b,
FROM parent;

instead of:

SELECT
child12.a AS a,
child22.b AS b
FROM parent
LEFT OUTER JOIN child11 ON (child11.parent_id = parent.id)
LEFT OUTER JOIN child12 ON (child12.child11_id = child11.id)
LEFT OUTER JOIN child21 ON (child21.parent_id = parent.id)
LEFT OUTER JOIN child22 ON (child22.child21_id = child21.id);

It's basically the same inversion model used by the HQL query language
of Hibernate, or the similar JPQL of JPA. They're quick and convenient,
but can get insanely expensive to execute when implemented as join sets.

To do this well, the DB really needs a query optimiser that's smart
enough to know when it should be JOINing vs when it should be issuing
subqueries or even using path-operator-specific plans.

The main area *I* find path operators appealing is when combined with
features like `json`, so whole graphs can be fetched and returned in
single queries. For example something like:

SELECT row_to_json(ROW(
customer.*,
array_agg(customer->address) AS addresses,
array_agg(customer->contacts) AS contacts
)) AS json_result
FROM customer;

would potentially help get rid of a huge amount of the ugliness ORMs
currently do with de-duplicating results after doing huge left outer
join chains.

--
Craig Ringer

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Chris Travers 2012-08-20 03:52:07 Re: How hard would a "path" operator be to implement in PostgreSQL
Previous Message David Johnston 2012-08-20 03:13:01 Re: How hard would a "path" operator be to implement in PostgreSQL