Re: Trees: integer[] outperformed by ltree

From: Jan Walter <john(at)commontongue(dot)com>
To: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Trees: integer[] outperformed by ltree
Date: 2013-11-05 21:52:32
Message-ID: 527968A0.6070106@commontongue.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 5.11.2013 20:51, Merlin Moncure wrote:
> On Tue, Nov 5, 2013 at 11:30 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Tue, Nov 5, 2013 at 6:25 AM, Jan Walter <john(at)commontongue(dot)com> wrote:
>>> Hi,
>>>
>>> I am in a need of a very robust (esp. fast in read, non-blocking in update)
>>> tree structure storage (95% trees are of depth <4, current max. is 12). We
>>> have 10k-100k trees now, millions in the future.
>>> I made many tests, benchmarks of usual operations, and after all,
>>> materialized path ('1.5.3' path notation) seems most promising.
>> materialized path approaches tend to be ideal if the tree remains
>> relatively static and is not too deep. The downside with matpath is
>> that if a you have to move a node around in the tree, then all the
>> child elements paths' have to be expensively updated. I bring this up
>> as it relates to your 'non-blocking in update' requirement: in matpath
>> an update to parent can update an unbounded number of records.

Thanks for your remark.
Materialized path is still better in updates than nested sets we are
using currently.
Although adjacency lists with recursive CTEs were initially my favorite
substitution (smallest lock space for node relocation), whey are
completely failing in e.g. order by path (depth) task (150s vs. 31ms via
integer[]), and twice slower in simple descendant-based tasks.
I am yet considering it (if I moved e.g. ordering to application server
level), and still need to rewrite couple of more sophisticated scenarios
to CTEs to be absolutely sure if it fails; anyway MP seems more promising.
I also tend to have the tree structure completely independent on other
data belonging to nodes.

Or did you have any other model in your mind?

> hm, why do you need gist/gin for the int[] formulation? what are your
> lookup requirements? with int[] you can typically do contains with
> simple btree.

I do not think so, i.e. neither my tests nor
http://www.postgresql.org/docs/current/static/indexes-types.html showed
it works for <@ or @>. Using it for <= operator puts btree index to
query plan in some (not all) scenarios. Still it needs to be accompanied
with <@, and the performance goes in more complex scenarios down.

'Start with' I was mentioning, would be ideal.

Jan

P. S. Just to have a feeling, this is a simple overview of my current
benchmarks.

*scenario**
* *adjacency list* *nested set* *ltree path* *array path* *neo4j*
ancestors (42) 16ms 31ms 31ms 15ms 50ms/5ms
ancestors (1.000.000) 16ms 50ms 31ms 15ms
descendants (node 42) 180ms 90ms 90ms 140ms 2s
descendants (node 1) 4s 2s 2s 2s above all bounds
descendants 3 far 15ms 20s 31ms 65ms 50ms
order by path (depth) 150s 40ms 31ms 31ms
order by path (width)




children_above_fitting w/ tags 850ms 270ms 950ms
ids as descendants below ids 850ms 250ms 950ms

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2013-11-05 22:19:26 Re: Trees: integer[] outperformed by ltree
Previous Message Josh Berkus 2013-11-05 19:59:57 Re: postgresql recommendation memory