Re: How to represent a bi-directional list in db?

From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Pankaj Jangid <pankaj(dot)jangid(at)gmail(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: How to represent a bi-directional list in db?
Date: 2019-09-23 08:54:48
Message-ID: CA+bJJbz-UABvh-eygr8ic6Np22Q+JFEeyPZ+j+sCZB8NOAHbsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Pankaj:

On Mon, Sep 23, 2019 at 4:07 AM Pankaj Jangid <pankaj(dot)jangid(at)gmail(dot)com> wrote:
> Thanks. This resolved my problem of NULL/NOT NULL conflict. I wasn't
> aware that SERIAL is by default NOT NULL.

Not only that. Once you strip the annoying NOT NULL the only thing
remaining on a serial column is a "default nextval", which you
normally do not want ( you could think of populating the table in
creative ways, but they are on a different sequence than the one you
use for the ID column ).

> > Also, you may have problems populating this kind of table, as you will
> > not have the ids from either prev or next stage when building it.
> If NULL value is allowed I can fill it up with NULL initially. Right? Or
> is there something wrong here.

There is not, you can use (id,prev,next) = (1,null,null) and then
update, but you are going to need to travel up and down a lot, or
store a lot of data. If you use the trick I comment later of just
using "prev", you can do, on a table having (id=serial, prev=int),
build a sequence by doing "prev_id=null"; insert (id,prev,other_data)
returning id; copy return value to prev_id, rinse and repeat.

Also note that you can query the sequence AND advance it and then
insert all rows without default values.

> > And lastly, in SQL you do not really need a doubly linked list, just
> > populate prev_stage_id, and index it and you can query next stage of a
> > tuple using it.
> Could you please elaborate? Suppose I have this table,
> CREATE TABLE stages (
> id SERIAL PRIMARY KEY,
> name VARCHAR(80) NOT NULL,
> next_id INTEGER REFERENCE stages NULL,
> );
> What would be the backward query in that case? Forward is clear. This is
> forward query,
> SELECT name FROM stages WHERE next_id = 123;

No. That is a BACKWARDS QUERY. You are in row 123, you go BACK to its
preceedeing one.
If you need a traversable list containing (ListID, id,name) = x,1,A;
x,2,b; x,3;c ( I've added the ListId column to make it more
interesting/reallistic, you normally do not have a single table)
In sql you can build a (ListId, id, prev_id, name ) table ( PREV is
easier, as when you insert a row, in a normal application, you know
the previous one, but not the next one ) with the data
(x,1,null,a),(x,2,1,b),(x,3,2,c) ( the last one is a synthetic
sentinel and assumes nullable id), you can do it in a lot of ways.

To traverse it forward you just querying "select id where listid=x and
next_id is null" to locate the head (1), and then just go forward by
selecting with prev_id = last got id until you hit zero results.

To traverse backwards there are several ways. In the real cases I've
used I always had a "list row" where I could store the node for the
1st stage. In that cases i linked them circularly, (id=1, prev=3), so
bidirectional traversing was easy. Or you can use a special sentinel
node ( with a marker, like name=null). The thing is you locate the
last row, and then just query with id=last got prev_id. I do not
remember the details, but probably your "stages" are stages of
something which has a row, which can readily have a "first_stage_id"
or something similar.

Lists in tables are not the same as in C, where you directly store
pointers which point outwards. In this case any unique data serves as
a pointer, slow ( table scan ) by default, faster if you index the
column.

Anyway, unless you need the "linked list" functionality for something
( really heavy manipulation of large stage lists, splicing things
around ), I've normally found it's easier, in sql, to model this kind
of thing with a master-detail + order column.
( whatever = (id, ...., first_stage_id), stages=(id, order, .... ) )

Francisco Olarte.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andreas Kretschmer 2019-09-23 08:55:39 Re: pg_receivexlog or archive_command
Previous Message Andrew Gierth 2019-09-23 08:44:19 Re: How to get timezone offset in timestamp with time zone AT TIME ZONE output.