Re: Re: Help!!! Trying to "SELECT" and get a tree structure back.

From: Mark Stosberg <mark(at)summersault(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Re: Help!!! Trying to "SELECT" and get a tree structure back.
Date: 2001-08-16 20:34:41
Message-ID: 3B7C2E52.4F645776@summersault.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

--CELKO-- wrote:
>
> >> The table causing my headache:
>
> CREATE TABLE app_components
> (id NUMERIC(7) NOT NULL PRIMARY KEY,
> name VARCHAR(100) NOT NULL,
> description VARCHAR(500) NULL,
> parent_id NUMERIC(7) NULL
> REFERENCES app_components(id)
> ON DELETE CASCADE,
> CONSTRAINT appcomp_name_u UNIQUE (name, parent_id)); <<

I first tried the above approach to model trees in SQL, which also
caused me
headaches. The recursion needed to find all the ancestors for a given
id was slow. So I bought and looked through Joe Celko's book (who recently
posted on this topic). I implemented his ideas, and found that they were
better than the method above (and faster, as he says), but I still
wasn't satisfied. First, I didn't like that the notion wasn't easily
parsable for me. Updating and deleting categories felt like hacks, and
moving a category seemed like too much work. So I kept looking for new
ideas to model trees in SQL. On my third try, I found a solution I was
happy with, which I'll call the "sort key" method. I first read about it here:

http://philip.greenspun.com/wtr/dead-trees/53013.htm
(Search for "Sort keys deserve some discussion") on this page

The sort key is a single string that gives you the location of a node in
a tree.
Used in conjunction with a parent_id, I found that most of the questions
I was asking were easy to answer: Who is my parent? Who are all my
ancestors? Who are my immediate children? How many descendants do I
have? Who are siblings? Furthermore, it's fairly straightforward to
manipulate items using this structure, and queries are fast-- most
questions can answered with one SQL statement. Finally, the sort_keys
are fairly human parsable, which is nice. The trade-off for all these
features is that you have a fixed number of immediate children for any
parent (based on how many characters are used for each piece of the sort
key). I think in my application to categorize data, each parent can only
have 62 immediate children. I can live with that.

Cascade is a complete (free) Perl/Postgres application using this scheme
if you are interested in seeing these ideas in action. It's homepage is here:
http://summersault.com/software/cascade/

You'll be able to get a demo and source code from there.

Thanks,

-mark
http://mark.stosberg.com/

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2001-08-16 20:42:06 Re: Interval FAQ - please review
Previous Message Oleg Lebedev 2001-08-16 20:28:42 Nested JOINs