Jesus, what have I done (was: LONG)

From: wieck(at)debis(dot)com (Jan Wieck)
To: pgman(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian)
Cc: wieck(at)debis(dot)com, pgsql-hackers(at)postgreSQL(dot)org
Subject: Jesus, what have I done (was: LONG)
Date: 1999-12-12 01:33:08
Message-ID: m11wxsq-0003kJC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Bruce Momjian wrote (in several messages):

> Actually, in looking at the fsync() system call, it does write the
> entire file descriptor before marking the transaction as complete, so
> there is no hard reason not to raise it, but because the OS has to do
> two reads to get 16k, I think we are better keeping 8k as our base block
> size.

Agreed. Let's stay with the 8K default.

> -1|oid|attno

Actually I think you need two more informations to move it
around independently. As you agreed somewhere else (on my
TRUNCATE issue), it would be better to keep the long values
in a per table expansion relation. Thus, you need the Oid of
that too at least. Also, it would be good to know the size of
the data before fetching it, so you need that to.

But that's not the important issue, there's also an (IMHO
dangerous) assumption on it, see below.

> Now it would be nice if we could set the varlena length to 12, it's
> actual length, and then just somehow know that the varlena of 12 was a
> long data entry. Our current varlena has a maximum length of 64k.
>
> Or make struct varlena vl_len a 15-bit field, and make islong a 1-bit
> field. I don't remember if using & manually or bit fields is faster.

I don't see vl_len as a 15-bit field. In the current sources
(in postgres.h), it is an int32. And I'm sure that not any
code is aware that some magic bit's in it contain a special
meaning. At least the types I added recently (numeric and
lztext) aren't. Nor am I sure, a variable length Datum is
never duplicated somewhere, just by using the information
from vl_len, with or without using the macro. Thus we would
have to visit alot of code to make sure this new variable
length Datum can be passed around as you like.

And the IMHO most counting drawback is, that existing user
type definitions treat the first 32 bits in a variable length
data type just as I interpreted the meaning up to now. So we
could occationally break more than we are aware of.

> In your system, a sequential scan of the table will pull in all this
> stuff because you are going to expand the tuple. That could be very
> costly. In my system, the expansion only happens on output if they LONG
> field does not appear in the WHERE or ORDER BY clauses.

In my system, it would do exactly as in your's, because they are mostly the
same. The modification done to the tuple in heap_insert() and heap_replace(),
just before the call to RelationPutHeapTupleAtEnd(), makes each
LONG Datum of varsize 20. Just that the first 32 bits don't contain any
magic information.

> > Maybe we make this mechanism so general that it is
> > automatically applied to ALL varsize attributes? We'll end up
> > with on big pg_long where 90+% of the databases content will
> > be stored.
>
> If most joins, comparisons are done on the 10% in the main table, so
> much the better.

Yes, but how would you want to judge which varsize value to
put onto the "secondary" relation, and which one to keep in
the "primary" table for fast comparisions?

I think you forgot one little detail. In our model, you can
only move around the Datum's extended information around as
is. It will never be expanded in place, so it must be fetched
(index scan) again at any place, the value itself is
required.

The installed base currently uses varsize attributes with
indices on them to condition, sort and group on them. Now
pushing such a field into "secondary" occationally will cause
a substantial loss of performance.

So again, how do you determine which of the attributes is a
candidate to push into "secondary"? It is a such generic
approach, that I cannot imagine any fail safe method.

I'd better like to have another LONG data type, that enables
me to store huge string into but where I exactly know what I
can't do with, than having some automatic detection process
that I cannot force to do what I want. It happened just to
often to me, that these "user friendly better knowing what I
might want" systems got me by the ball's. I'm a real
programmer, so there's allway a way out for me, but what
shoud a real user do?

> Let's fact it. Most long tuples are store/retrieve, not ordered on or
> used in WHERE clauses. Moving them out of the main table speeds up
> things. It also prevents expansion of rows that never end up in the
> result set.

Having a tuple consisting of 30+ attributes, where 20 of them
are varsize ones (CHAR, VARCHAR, NUMERIC etc.), what makes it
a long tuple? Yes, I'm repeating this question once again,
because we're talking about a "one must fit all cases" here.

> stuff because you are going to expand the tuple. That could be very
> costly. In my system, the expansion only happens on output if they LONG
> field does not appear in the WHERE or ORDER BY clauses.

No I won't. As explained, I would return a tuple as is, just
with the LONG reference information. It will only, but then
allways again, be expanded if needed to compare, store again
or beeing output to the client. This "allways again" is one
of my drawbacks against your "treating all varsize pushable"
concept. In one of my early projects, I had to manage a
microVax for a year, and I love systems that can be fine
tuned since then, really! Auto detection is a nice feature,
but if that failes and you don't have any override option,
you're hosed.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 1999-12-12 02:06:49 Re: [HACKERS] Re: [PATCHES] pg_dump primary keys
Previous Message Bruce Momjian 1999-12-12 00:01:49 Re: [HACKERS] LONG