Re: Is EXISTS the most efficient approach for PostgreSql to check for existence of nodes in a tree?

From: Rob Sargentg <robjsargent(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Is EXISTS the most efficient approach for PostgreSql to check for existence of nodes in a tree?
Date: 2012-05-17 14:19:38
Message-ID: 4FB508FA.40009@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 05/17/2012 03:06 AM, Seref Arikan wrote:
> I have a key value table in my Postgresql db, which represents
> hierarchical data through parent_feature_mapping column that points to
> id of feature_mapping_id column of the same table.
>
> I need to select root nodes that has children which satisfy various
> conditions. The conditions may extend to children of children, so I'm
> trying to find roots of trees that contain paths that satisfy the
> given constraints.
>
> An example is finding the trees where the root node has type
> 'COMPOSITION' and root node's archetypeNodeId attribute has value
> 'openEHR-EHR-COMPOSITION.discharge.v1' another constraint is root node
> having a child of type 'CONTENTITEM' that in turn has a child of type
> 'ITEMSTRUCTURE'
> All nodes in a tree have the same payload Id. The fastest query that I
> could write so far is given below.
>
>
> SELECT root.id <http://root.id> from path_value as root
> WHERE
> root.rm_type_name = 'COMPOSITION'
> AND
> root.feature_name = 'composition'
> AND
> EXISTS (SELECT 1 from path_value as anodeid
> WHERE
> anodeId.parent_feature_mapping_id =
> root.feature_mapping_id
> AND
> anodeId.payload_id = root.payload_id
> AND
> anodeId.feature_name = 'archetypeNodeId'
> AND
> anodeId.val_string =
> 'openEHR-EHR-COMPOSITION.discharge.v1'
> LIMIT 1
> )
>
> AND
> EXISTS (SELECT 1 from path_value as node1
> WHERE
> node1.payload_id = root.payload_id
> AND
> node1.parent_feature_mapping_id = root.feature_mapping_id
> AND
> node1.feature_name = 'content'
> AND
> node1.rm_type_name = 'CONTENTITEM'
> AND
> EXISTS (SELECT 1 from path_value as node2
> WHERE
> node2.payload_id = node1.payload_id
> AND
> node2.parent_feature_mapping_id =
> node1.feature_mapping_id
> AND
> node2.rm_type_name = 'ITEMSTRUCTURE'
> LIMIT 1)
> LIMIT 1)
>
> My question is: is this the best approach in terms of performance?
> This is an attempt to identify XML payloads that fit certain criteria.
> I have also considered using an ltree column that will contain the
> tree in a from that I can query as an alternative to sql based method,
> or I can use xpath queries on XML payload.
>
> The create statement for my table is as follows:
>
> CREATE TABLE public.path_value (
> val_string TEXT,
> feature_mapping_id INTEGER NOT NULL,
> parent_feature_mapping_id INTEGER,
> feature_name TEXT,
> rm_type_name TEXT,
> path INTEGER NOT NULL,
> payload_id INTEGER NOT NULL,
> id INTEGER NOT NULL,
> ehr_id INTEGER,
> CONSTRAINT path_value_pkey PRIMARY KEY(id)
> ) WITHOUT OIDS;
>
>
> Best regards
> Seref
>
Any other constraints or indexes on that table?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Paulo Correia 2012-05-17 15:10:17 Re: Postgres 9.0 Streaming Replication and Load Balancing?
Previous Message Seref Arikan 2012-05-17 09:06:26 Is EXISTS the most efficient approach for PostgreSql to check for existence of nodes in a tree?