Re: serialization errors

From: Ryan VanderBijl <rvbijl-pgsql(at)vanderbijlfamily(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: serialization errors
Date: 2003-01-30 04:39:30
Message-ID: 20030130043930.GE32349@vanderbijlfamily.com
Lists: pgsql-general

On Wed, Jan 29, 2003 at 12:59:10AM -0500, Tom Lane wrote:
> Ryan VanderBijl <rvbijl(at)vanderbijlfamily(dot)com> writes:
> > For adding a new node, I basically do this:
> > insert into tree_node(parent_id, node_order)
> > values(1, (select max(node_order)+1 from tree_node where parent_id = 1) );
>
> That "select max()+1" makes me itch ;-) ... that's as perfect a recipe
> for concurrency problems as I can imagine.
>
> At first glance it seems that all this is doing is assigning sequential
> node_order values to the children of any particular parent. Is it
> really necessary that those node_order values be consecutive? If they
> only need to be strictly increasing, you could use a sequence to
> generate them. (Actually, I think you could dispense with node_order
> entirely, and use the node_id field instead.)
>
> In any case, I'd suggest some careful thought about what your data
> structure really is, and how you could avoid creating serialization
> bottlenecks like this one.

I'm open to suggestions of a better way to store tree structures in a
database ...

My users really do need to be able to manage the order of the children.
For example, if the tree looks like:
Box 1
  - document A
  - document B
  - document C
Box 2
  - thing Z
  - thing Y
  - thing X

I need to be able to drag 'thing Y' and place it between document B and
document C. (And, at the same time, I need to let separate people add
items under both boxes.) (And within several seconds, all my other
clients need to see the same change!)

Thus I need to keep track of the node_order field. I've thought
about using non-adjacent numbers (e.g. 10, 20), and then filling in
the gaps when dropping in between. However, I'd have to worry about
the case where there are no available numbers in between, and it just
seems that if I always keep them adjacent, life will be easier later on.
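To make the worry above concrete, here is a small sketch of the gap-numbering scheme in Python (all names are made up for illustration; nothing here is PostgreSQL-specific). It picks a node_order between two siblings, and falls back to renumbering every sibling when the gap has closed -- which is exactly the case that makes adjacent numbering feel simpler:

```python
# Sketch of the gap-numbering idea: children carry non-adjacent
# node_order values (10, 20, 30, ...) so a node can be dropped between
# two siblings without touching the others -- until the gap closes,
# at which point all siblings must be renumbered anyway.

STEP = 10  # spacing between freshly renumbered siblings

def order_between(before, after, siblings, step=STEP):
    """Pick a node_order strictly between `before` and `after`.

    `siblings` is the sorted list of existing node_order values.
    Returns (new_order, siblings) where `siblings` has been renumbered
    with fresh gaps if no integer fit between the two neighbours.
    """
    if after - before > 1:
        # Room in the gap: take the midpoint, leave siblings alone.
        return (before + after) // 2, siblings
    # Gap exhausted: renumber every sibling with fresh spacing,
    # then place the new node between the same two neighbours.
    renumbered = [(i + 1) * step for i in range(len(siblings))]
    idx = siblings.index(before)
    return order_between(renumbered[idx], renumbered[idx + 1],
                         renumbered, step)
```

Note the renumbering branch rewrites every sibling row, so under serializable isolation it conflicts with any concurrent insert under the same parent -- the gaps only postpone the problem.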

However, back to the serialization issue ... I'm going to take one
(er, actually there are two arguments) more stab at convincing you a
unique constraint error should be flagged as a serialization error
(a "serialization error: unique constraint violated"?)

1. According to the docs, a serializable transaction: is "as if
transactions had been executed one after another, serially, rather
than concurrently."

Thus any problem caused by concurrently committed data is an error
in trying to serialize the transactions.

2. This change shouldn't [love that word] affect anyone. There will
already be a loop for retrying the transaction [assumption].
If the change is made, existing programs will receive a "serializable
error" instead of "unique constraint violated" error, then retry the
transaction, and *then* get the unique constraint violated error.

What's the benefit of the change: it makes serializable transactions
simpler to think about, and simpler to work with. At the very least,
it removes one more special case that programmers need to deal with
when doing serializable transactions.
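That "one more special case" shows up in the retry loop itself. A hypothetical Python sketch (the exception classes stand in for whatever a real driver would raise; no actual database is involved): under the current behaviour a unique-constraint violation must be treated as a second retryable case, because it may merely mean a concurrent transaction committed first.

```python
# Hypothetical retry loop illustrating the special case.
# SerializationError / UniqueViolation are stand-ins, not a real
# driver's exceptions.

class SerializationError(Exception):
    pass

class UniqueViolation(Exception):
    pass

def run_transaction(work, retries=3):
    """Run `work` (a callable), retrying on concurrency failures.

    Today the except clause must list UniqueViolation alongside
    SerializationError, since the violation may only mean a concurrent
    insert beat us. If such violations were reported as serialization
    errors, the tuple below would collapse to a single exception.
    """
    for attempt in range(retries):
        try:
            return work()
        except (SerializationError, UniqueViolation):
            if attempt == retries - 1:
                raise  # genuinely stuck (or truly duplicate data)

# Usage: a work function that loses the race once, then succeeds.
attempts = []
def flaky_insert():
    attempts.append(1)
    if len(attempts) == 1:
        raise UniqueViolation("duplicate node_order from concurrent insert")
    return "committed"
```

After a retry succeeds, a *persisting* UniqueViolation still has to be interpreted as real duplicate data, which is the behaviour described above for existing programs under the proposed change.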

Also, if I hadn't mentioned it before, at the very least, the web
documentation on serializable transactions should be updated to
mention that unique constraint violations from concurrent updates
are NOT generated as serialization errors. [note: I can't recall
if this applies to both inserts and updates or just to inserts]

Thanks for your time!

Ryan

ps. I've had some problems getting this message out. Sorry if I
have sent duplicates.
--
Ryan VanderBijl rvbijl(at)vanderbijlfamily(dot)com
