Re: Manipulating complex types as non-contiguous structures in-memory

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Manipulating complex types as non-contiguous structures in-memory
Date: 2015-02-12 14:50:18
Message-ID: 14262.1423752618@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> BTW, I'm not all that thrilled with the "deserialized object" terminology.
>> I found myself repeatedly tripping up on which form was serialized and
>> which de-. If anyone's got a better naming idea I'm willing to adopt it.

> My first thought is that we should form some kind of TOAST-like
> backronym, like Serialization Avoidance Loading and Access Device
> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
> don't think there is anything per se wrong with the terms
> serialization and deserialization; indeed, I used the same ones in the
> parallel-mode stuff. But they are fairly general terms, so it might
> be nice to have something more specific that applies just to this
> particular usage.

Hm. I'm not against the concept, but those particular suggestions don't
grab me.

> I found the notion of "primary" and "secondary" TOAST pointers to be
> quite confusing. I *think* what you are doing is storing two pointers
> to the object in the object, and a pointer to the object is really a
> pointer to one of those two pointers to the object. Depending on
> which one it is, you can write the object, or not.

There's more to it than that. (Writing more docs is one of the to-do
items ;-).) We could alternatively have done that with two different
va_tag values for "read write" and "read only", which indeed was my
initial intention before I thought of this dodge. However, then you
have to figure out where to store such pointers, which is problematic
both for plpgsql variable assignment and for ExecMakeSlotContentsReadOnly,
especially the latter which would have to put any freshly-made pointer
in a long-lived context resulting in query-lifespan memory leaks.
So I decided early on that the read-write pointer should live right in the
object's own context where it need not be copied when swinging the
context ownership someplace else, and later realized that there should
also be a permanent read-only pointer in there for the use of
ExecMakeSlotContentsReadOnly, and then realized that they didn't need
to have different va_tag values if we implemented the "is read-write
pointer" test as it's done in the patch. Having only one va_tag value
rather than two saves cycles, I think, because there are a lot of low-level
tests that don't need to distinguish, e.g. VARTAG_SIZE(). However, it
does make it more expensive when you do need to distinguish, so I might
reconsider that decision later. (Since these will never go to disk,
we can whack the representation around pretty freely if needed.)
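
A minimal sketch of the layout described above may help: both pointers live inside the object itself, share one tag value, and are told apart only by where they sit. This is an illustration under stated assumptions, not the patch's code. Every name here (ToyObject, ToyPointer, TOY_TAG_INMEM, toy_init, toy_tag_size, toy_is_read_write) is hypothetical, and the address comparison is only one plausible way to implement the "is read-write pointer" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
typedef struct ToyObject ToyObject;

typedef struct ToyPointer
{
    uint8_t    va_tag;  /* a single tag value shared by both flavors */
    ToyObject *obj;     /* the in-memory object being referenced */
} ToyPointer;

#define TOY_TAG_INMEM 1

struct ToyObject
{
    ToyPointer rw_ptr;  /* "primary": stored in the object, grants write access */
    ToyPointer ro_ptr;  /* "secondary": permanent read-only pointer */
    int        payload; /* placeholder for the real object contents */
};

/* Set up both embedded pointers so that they refer back to the object. */
void
toy_init(ToyObject *o, int payload)
{
    assert(o != NULL);
    o->rw_ptr.va_tag = TOY_TAG_INMEM;
    o->rw_ptr.obj = o;
    o->ro_ptr.va_tag = TOY_TAG_INMEM;
    o->ro_ptr.obj = o;
    o->payload = payload;
}

/* Low-level test that never needs to know which flavor it was handed. */
size_t
toy_tag_size(uint8_t tag)
{
    assert(tag == TOY_TAG_INMEM);
    return sizeof(ToyObject *);
}

/* Assumed test: a pointer is read-write only if it IS the primary slot. */
bool
toy_is_read_write(const ToyPointer *p)
{
    return p == &p->obj->rw_ptr;
}
```

In this sketch toy_tag_size() inspects only the tag, while toy_is_read_write() has to follow the pointer to the object, which mirrors the trade-off described above: cheap when no distinction is needed, more expensive when it is.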

Also, I have hopes of allowing deserialized-object pointers to be copied
into tuples as pointers rather than by reserialization, if we can
establish that the tuple is short-lived enough that the pointer will stay
good, which would be true in a lot of cases during execution of queries by
plpgsql. With the patch's design, a pointer so copied will automatically
be considered read-only, which I *think* is the behavior we'd need. If it
turns out that it's okay to propagate read-write-ness through such a copy
step then that would argue in favor of using two va_tag values.

It may be that this solution is overly cute and we should just use two
tag values. But I wanted to be sure it was possible for copying of a
pointer to automatically lose read-write-ness, in case we need to have
such a guarantee.
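
To make the difference between the two designs concrete, here is a hedged sketch comparing a location-based test with a tag-based test when a pointer is copied byte for byte. All names (SketchObj, SketchPtr, the SKETCH_TAG_* values, and the functions) are hypothetical and chosen for illustration only; the actual patch may implement the test differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical types for illustration only. */
typedef struct SketchObj SketchObj;

typedef struct SketchPtr
{
    unsigned char tag;  /* used only by the two-tag variant below */
    SketchObj    *obj;  /* object this pointer refers to */
} SketchPtr;

struct SketchObj
{
    SketchPtr rw;       /* read-write pointer stored in the object */
    SketchPtr ro;       /* read-only pointer stored in the object */
};

enum { SKETCH_TAG_RO = 2, SKETCH_TAG_RW = 3 };

void
sketch_init(SketchObj *o)
{
    assert(o != NULL);
    o->rw.tag = SKETCH_TAG_RW;
    o->rw.obj = o;
    o->ro.tag = SKETCH_TAG_RO;
    o->ro.obj = o;
}

/* One-tag style design: writability depends on where the pointer lives. */
bool
sketch_is_rw_by_address(const SketchPtr *p)
{
    return p == &p->obj->rw;
}

/* Two-tag design: writability travels with the pointer's bytes. */
bool
sketch_is_rw_by_tag(const SketchPtr *p)
{
    return p->tag == SKETCH_TAG_RW;
}

/* Byte-for-byte copy, as would happen when a pointer lands in a tuple. */
SketchPtr
sketch_copy(const SketchPtr *src)
{
    SketchPtr dst;

    memcpy(&dst, src, sizeof(dst));
    return dst;
}
```

Under the location-based test a copy of the read-write pointer no longer sits in the object's own slot, so it is treated as read-only without any extra work. Under the tag-based test the same copy still claims to be read-write, which is only acceptable if propagating read-write-ness through a copy turns out to be safe.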

> This is a clever
> representation, but it's hard to wrap your head around, and I'm not
> sure "primary" and "secondary" are the best names, although I don't
> have an idea as to what would be better. I'm a bit confused, though:
> once you give out a secondary pointer, how is it safe to write the
> object through the primary pointer?

It's no different from allowing plpgsql to update the values of variables
of pass-by-reference types even though it has previously given out Datums
that are pointers to them: by the time we're ready to execute an
assignment, any query execution that had such a pointer is over and done
with. (This implies that cursor parameters have to be physically copied
into the cursor's execution state, which is one of a depressingly large
number of reasons why datumCopy() has to physically copy a deserialized
value rather than just copying the pointer. But otherwise it works.)

There is more work to do to figure out how we can safely give out a
read/write pointer for cases like
    hstore_var := hstore_concat(hstore_var, ...);
Aside from the question of whether hstore_concat guarantees not to trash
the value on failure, we'd have to restrict this (I think) to expressions
in which there is only one reference to the target variable and it's an
argument of the topmost function/operator. But that's something I've not
tried to implement yet.
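
The proposed restriction can be sketched as a check over a toy expression tree: the target variable must be referenced exactly once, and that one reference must be a direct argument of the topmost function or operator. This is an assumption-laden illustration, not planner or plpgsql code; Expr, ExprKind, count_var_refs and can_pass_read_write are hypothetical names. It covers only the structural condition and says nothing about whether the called function leaves the value intact on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree; all names here are hypothetical. */
typedef enum { EXPR_VAR, EXPR_CONST, EXPR_FUNC } ExprKind;

typedef struct Expr
{
    ExprKind             kind;
    int                  varno;  /* meaningful only for EXPR_VAR */
    size_t               nargs;  /* meaningful only for EXPR_FUNC */
    const struct Expr  **args;   /* meaningful only for EXPR_FUNC */
} Expr;

/* Count every reference to variable "varno" anywhere in the tree. */
size_t
count_var_refs(const Expr *e, int varno)
{
    size_t n = 0;

    assert(e != NULL);
    if (e->kind == EXPR_VAR)
        return (e->varno == varno) ? 1 : 0;
    if (e->kind == EXPR_FUNC)
    {
        for (size_t i = 0; i < e->nargs; i++)
            n += count_var_refs(e->args[i], varno);
    }
    return n;
}

/*
 * True only if the top node is a function call, the target variable is
 * referenced exactly once in the whole expression, and that reference is
 * a direct argument of the top node.
 */
bool
can_pass_read_write(const Expr *top, int target)
{
    assert(top != NULL);
    if (top->kind != EXPR_FUNC)
        return false;
    if (count_var_refs(top, target) != 1)
        return false;
    for (size_t i = 0; i < top->nargs; i++)
    {
        const Expr *arg = top->args[i];

        if (arg->kind == EXPR_VAR && arg->varno == target)
            return true;
    }
    return false;
}
```

With this check, an expression shaped like f(x, const) qualifies for target x, while f(x, x) and f(g(x), const) do not: the first has two references, and in the second the only reference is buried below the topmost call.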

regards, tom lane
