From: | "Mikito Harakiri" <mikharakiri(at)iahu(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Making a tree with "millions and millions" of dynamic nodes |
Date: | 2003-12-03 02:51:07 |
Message-ID: | 3kczb.23$2A3.117@news.oracle.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
"Christian Fowler" <google(at)NOSPAM(dot)gravesweeper(dot)com> wrote in message
news:6b-dnQJnDI6YvlCiXTWc-g(at)speakeasy(dot)net(dot)(dot)(dot)
> For selection, it is crucial for me to get:
>
> 1. path generation speed
Path to the root? It is coded explicitly in the materialized path. All you
need to do is parsing the path and generating a list of prefixes that you
use within your query like this:
select * from hierarchy where path in ('1','1.2','1.2.1', '1.2.1.1')
If you have an index, and your rdbms optimiser is smart enough, the query
should execute fast.
> 2. immediate sibling speed
Here is the kludge:
select * from hierarchy where path in ('1.2.1.1','1.2.1.2','1.2.1.3',
'1.2.1.4','1.2.1.5')
Again, I assume that your query would use an index here.
If you get exactly 5 records, then you can't be sure that your node has that
many children. You have to repeat query like this:
select * from hierarchy where path in (
'1.2.1.1','1.2.1.2','1.2.1.3', '1.2.1.4','1.2.1.5'
,'1.2.1.6','1.2.1.7','1.2.1.8', '1.2.1.9','1.2.1.10')
,'1.2.1.11','1.2.1.12','1.2.1.13', '1.2.1.4','1.2.1.15')
,'1.2.1.16','1.2.1.17','1.2.1.18', '1.2.1.19','1.2.1.20')
,'1.2.1.21','1.2.1.22','1.2.1.23', '1.2.1.4','1.2.1.25')
Yet again, there might be more than 25 children, so you'll have to run yet
again more "generos" query.
The pitfall here is that at some point the optimiser may get tired to do
"or" expansion, so that at some point you might end up with full table scan.
But, this is implementation dependent, so that you might be able to
influence optimizer decision.
As you see, I'm not worried how many bytes materialized path has, or if
parsing it takes more time than multiplying 2 integers. My concern is if
your query can leverage index or not.
> 3. children count speed
Children, or descendants? I guess no method can beat Celko's descendant's
formula
#descendants=(rgt-lft+1)/2
The count is implicitly stored at the root node, so that even for hierarchy
1M nodes we answer how many nodes are there instantly. This is also an
Achiles' heel of the method: any update to the hierarchy triggers refresh
of the "counter". One aggregate is never enough, though: it's often useful
to know total salary too:-)
If you meant not descendants, but children, then use a method similar to
bullet #2.
From | Date | Subject | |
---|---|---|---|
Next Message | Lamar Owen | 2003-12-03 04:09:14 | Re: perl(Pg) (S)RPM |
Previous Message | jini us | 2003-12-03 01:23:11 | Re: embedded postgresql + C++ IDE |