Quick Links

Re: logical changeset generation v6.7

From:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To:	andres(at)2ndquadrant(dot)com
Cc:	robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: logical changeset generation v6.7
Date:	2013-12-10 06:43:18
Message-ID:	20131210.154318.209683106.horiguchi.kyotaro@lab.ntt.co.jp
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hello, sorry for annoying you with meaningless questions.
Your explanation made it far clearer to me.

This will be the last message I mention on this patch..

> On 2013-12-05 22:03:51 +0900, Kyotaro HORIGUCHI wrote:
> > > > - You assined HeapTupleGetOid(tuple) into relid to read in
> > > > several points but no modification. Nevertheless, you left
> > > > HeapTupleGetOid not replaced there. I think 'relid' just for
> > > > repeated reading has far small merit compared to demerit of
> > > > lowering readability. You'd be better to make them uniform in
> > > > either way.
> > >
> > > It's primarily to get the line lengths halfway under control.
> >
> > Mm. I'm afraid I couldn't catch your words, do you mean that
> > IsSystemClass() or isTempNamespace() could change the NULL bitmap
> > in the tuple?
>
> Huh? No. I just meant that the source code lines are longer if you use
> "HeapTupleGetOid(tuple)" instead of just "relid". Anway, that patch has
> since been committed...

Really? Sorry for annoying you. Thank you, I understand that.

> > > > ===== 0002:
> > > >
> > > > - You are identifying the wal_level with the expr
evel >=
> > > > WAL_LEVEL_LOGICAL' but it seems somewhat improper.
> > >
> > > Hm. Why?
> >
> > It actually does no harm and somewhat trifling so I don't
> > you should fix it.
> >
> > The reason for the comment is the greater values for vel
> > are undefined at the moment, so strictly saying, such values
> > should be handled as invalid ones.
>
> Note that other checks for wal_level are written the same
> way. Consider how much bigger this patch would be if every
> usage of wal_level would need to get changed because a new
> level had been added.

I know the objective. But it is not obvious that the future value
will need the process. Anyway we never know that until it
actually comes so I said it trifling.

> > > > - RelationIsAccessibleInLogicalDecoding and
> > > > RelationIsLogicallyLogged are identical in code. Together with
> > > > the name ambiguity, this is quite confising and cause of
> > > > future misuse between these macros, I suppose. Plus the names
> > > > seem too long.
> > >
> > > Hm, don't think they are equivalent, rather the contrary. Note one
> > > returns false if IsCatalogRelation(), the other true.
> >
> > Oops, I'm sorry. I understand they are not same. Then I have
> > other questions. The name for the first one
> > 'RelationIsAccessibleInLogicalDecoding' doesn't seem representing
> > what its comment reads.
> >
> > | /* True if we need to log enough information to have access via
> > | decoding snapshot. */
> >
> > Making the macro name for this comment directly, I suppose it
> > would be something like 'NeedsAdditionalInfoInLogicalDecoding' or
> > more directly 'LogicalDeodingNeedsCids' or so..
>
> The comment talks about logging enough information that it is accessible
> - just as the name.

Though I'm not convinced for that. But since it also seems not so
wrong and you say so, I pretend to be convinced:-p

> > > > - In heap_insert, the information conveyed on rdata[3] seems to
> > > > be better to be in rdata[2] because of the scarecity of the
> > > > additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
> > > > seems to be needed. Also is in heap_multi_insert and
> > > > heap_upate.
> > >
> > > Could you explain a bit more what you mean by that? The reason it's a
> > > separate rdata entry is that otherwise a full page write will remove the
> > > information.
> >
> > Sorry, I missed the comment 'so that an eventual FPW doesn't
> > remove the tuple's data'. Although given the necessity of removal
> > prevention, rewriting rdata[].buffer which is required by design
> > (correct?) with InvalidBuffer seems a bit outrageous for me and
> > obfuscating the objective of it.
>
> Well, it's added in a separate rdata element. Just as in dozens of other
> places.

Mmmm. Was there any rdata entriy which has substantial content
but .buffer is set to InvalidBuffer just for avoiding removal by
fpw? Although for the objection I made, I became to be accostomed
to see there and I became to think it is not so bad.. I put an
end by this comment.

> > > > - In heap_delete, at the point adding replica identity, same to
> > > > the aboves, rdata[3] could be moved into rdata[2] making new
> > > > type like 'xl_heap_replident'.
> > >
> > > Hm. I don't think that'd be a good idea, because we'd then need special
> > > case decoding code for deletes because the wal format would be different
> > > for inserts/updates and deletes.
> >
> > Hmm. Although one common xl_heap_replident is impractical,
> > splitting a logcally single entity into two or more XLogRecDatas
> > also seems not to be a good idea.
>
> That's done everywhere. There's basically two reasons to use separate
> rdata elements:
> * the buffers are different
> * the data pointer is different
>
> The rdata chain elements don't survive in the WAL.

Well, I came to see rdata's as simple containers holding
fragments to be written into WAL stream. Thanks for patiently
answering for such silly questions.

> > Considering above, looking heap_xlog_insert(), you marked on
> > xlrec.flags with XLOG_HEAP_CONTAINS_NEW_TUPLE to signal decoder
> > that the record should have tuple data not being removed by fpw.
> > This is the same for the update record. So the redoer(?) also can
> > distinguish whether the update record contains extra tuple data
> > or not.
>
> But it doesn't know the length of the individual records, so knowing
> there are two doesn't help.

The length can be attached only when
XLOG_HEAP_CONTAINS_NEW_TUPLE, but it is right for you to say
that's too complicated.

> > On the other hand, the update record after patched is longer by
> > sizeof(uint16) regardless of whether 'tuple data' is attached or
> > not. I don't know the value of the size in WAL stream, but the
> > smaller would be better maybe.
>
> I think that'd make things too complicated without too much gain in
> comparison to the savings.

As written above, I'm convinced with that.

> > # Of course, it doesn't matter if the object for OLD can
> > # naturally be missing as English.
>
> Well, I think the context makes it clear enough.

Thank you. I learned that, perhaps :-)

> > I'm not a native English speaker as you know:-)
>
> Neither am I ;).

Yeah, you should've been more experienced than me :-p

> > undefining XLOG_HEAP_CONTAINS_OLD and use separte macros type 1
> > | if (xlrec->flags & XLOG_HEAP_CONTAINS_OLD_KEY ||
> > | xlrec->flags & XLOG_HEAP_CONTAINS_OLD_TUPLE)
> > | {
> > (I belive this should be optimized by the compiler:-)
>
> It's not about efficiency, imo the other variant looks clearer. And will
> continue to work if we add the option to selectively log columns or
> such.

Well, It's ok for me after knowing that the 'OLD' alone makes
sense.

> > if (RelationIsAccessibleInLogicalDecoding(relation))
> > + {
> > | rdata_heap_new_cid(&rdata[0], relation, heaptup);
> > + xlrec.flags |= XLOG_HEAP_CONTAINS_NEW_CID;
> > + }
> > + else
> > + rdata_void(&rdata[0])
> > + rdata[0].next = &(rdata[1]);
> >
> > xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
> > xlrec.target.node = relation->rd_node;
> > xlrec.target.tid = heaptup->t_self;
> > | rdata[1].data = (char *) &xlrec;
> >
> > If you don't agree with this, I don't say no more about this.
>
> It's a lot more complex than that. This throws of all kind of size
> calculations. and it makes the redo functions more complex - and they
> are much more often executed for !CONTAINS_NEW_CID.

Mmm. I don't know the exact reason for omitting length field in
previous xlog format. I've supposed the saved bytes are more
significant than the calculation clocks. But it sould be wrong or
the efficiency was defeated by the expected coplexity, supposing
from that it is alrady committed.

> > > > - The macro name INDEX_ATTR_BITMAP_KEY should be
> > > > INDEX_ATTR_BITMAP_FKEY. And INDEX_ATTR_BITMAP_IDENTITY_KEY
> > > > should be INDEX_ATTR_BITMAP_REPLID_KEY or something.
> > >
> > > But INDEX_ATTR_BITMAP_KEY isn't just about foreign keys... But I agree
> > > that INDEX_ATTR_BITMAP_IDENTITY_KEY should be renamed.
> >
> > Thank you, please remember to do that.
>
> I am not sure it's a good idea anymore, it doesn't really seem to
> increase clarity...

And also might be excessive. But it seemed somewhat unclear to
unaccustomed eyes:-)

Thank you for all your answers.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Re: logical changeset generation v6.7 at 2013-12-05 16:45:51 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Pavel Stehule	2013-12-10 06:45:52	Re: plpgsql_check_function - rebase for 9.3
Previous Message	Pavel Stehule	2013-12-10 06:30:41	Re: plpgsql_check_function - rebase for 9.3