Re: new heapcheck contrib module

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2020-06-28 15:29:28
Message-ID: CAFiTN-sP5iFJGERTq4a4OLTCzjZKhLC5P6_k2bxQKXnuvdmzGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 22, 2020 at 5:44 AM Mark Dilger
<mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
>
>
> > On Jun 21, 2020, at 2:54 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > I have looked into 0001 patch and I have a few comments.
> >
> > 1.
> > +
> > + /* Skip over unused/dead/redirected line pointers */
> > + if (!ItemIdIsUsed(ctx.itemid) ||
> > + ItemIdIsDead(ctx.itemid) ||
> > + ItemIdIsRedirected(ctx.itemid))
> > + continue;
> >
> > Isn't it a good idea to verify the Redirected Itemtid? Because we
> > will still access the redirected item id to find the
> > actual tuple from the index scan. Maybe not exactly at this level,
> > but we can verify that the link itemid store in that
> > is within the itemid range of the page or not.
>
> Good idea. I've added checks that the redirection is valid, both in terms of being within bounds and in terms of alignment.
>
> > 2.
> >
> > + /* Check for tuple header corruption */
> > + if (ctx->tuphdr->t_hoff < SizeofHeapTupleHeader)
> > + {
> > + confess(ctx,
> > + psprintf("t_hoff < SizeofHeapTupleHeader (%u < %u)",
> > + ctx->tuphdr->t_hoff,
> > + (unsigned) SizeofHeapTupleHeader));
> > + fatal = true;
> > + }
> >
> > I think we can also check that if there is no NULL attributes (if
> > (!(t_infomask & HEAP_HASNULL)) then
> > ctx->tuphdr->t_hoff should be equal to SizeofHeapTupleHeader.
>
> You have to take alignment padding into account, but otherwise yes, and I've added a check for that.
>
> > 3.
> > + ctx->offset = 0;
> > + for (ctx->attnum = 0; ctx->attnum < ctx->natts; ctx->attnum++)
> > + {
> > + if (!check_tuple_attribute(ctx))
> > + break;
> > + }
> > + ctx->offset = -1;
> > + ctx->attnum = -1;
> >
> > So we are first setting ctx->offset to 0, then inside
> > check_tuple_attribute, we will keep updating the offset as we process
> > the attributes and after the loop is over we set ctx->offset to -1, I
> > did not understand that why we need to reset it to -1, do we ever
> > check for that. We don't even initialize the ctx->offset to -1 while
> > initializing the context for the tuple so I do not understand what is
> > the meaning of the random value -1.
>
> Ahh, right, those are left over from a previous design of the code. Thanks for pointing them out. They are now removed.
>
> > 4.
> > + if (!VARATT_IS_EXTENDED(chunk))
> > + {
> > + chunksize = VARSIZE(chunk) - VARHDRSZ;
> > + chunkdata = VARDATA(chunk);
> > + }
> > + else if (VARATT_IS_SHORT(chunk))
> > + {
> > + /*
> > + * could happen due to heap_form_tuple doing its thing
> > + */
> > + chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT;
> > + chunkdata = VARDATA_SHORT(chunk);
> > + }
> > + else
> > + {
> > + /* should never happen */
> > + confess(ctx,
> > + pstrdup("toast chunk is neither short nor extended"));
> > + return;
> > + }
> >
> > I think the error message "toast chunk is neither short nor extended".
> > Because ideally, the toast chunk should not be further toasted.
> > So I think the check is correct, but the error message is not correct.
>
> I agree the error message was wrongly stated, and I've changed it, but you might suggest a better wording than what I came up with, "corrupt toast chunk va_header".
>
> > 5.
> >
> > + ctx.rel = relation_open(relid, ShareUpdateExclusiveLock);
> > + check_relation_relkind_and_relam(ctx.rel);
> > +
> > + /*
> > + * Open the toast relation, if any, also protected from concurrent
> > + * vacuums.
> > + */
> > + if (ctx.rel->rd_rel->reltoastrelid)
> > + {
> > + int offset;
> > +
> > + /* Main relation has associated toast relation */
> > + ctx.toastrel = table_open(ctx.rel->rd_rel->reltoastrelid,
> > + ShareUpdateExclusiveLock);
> > + offset = toast_open_indexes(ctx.toastrel,
> > ....
> > + if (TransactionIdIsNormal(ctx.relfrozenxid) &&
> > + TransactionIdPrecedes(ctx.relfrozenxid, ctx.oldestValidXid))
> > + {
> > + confess(&ctx, psprintf("relfrozenxid %u precedes global "
> > + "oldest valid xid %u ",
> > + ctx.relfrozenxid, ctx.oldestValidXid));
> > + PG_RETURN_NULL();
> > + }
> >
> > Don't we need to close the relation/toastrel/toastindexrel in such
> > return which is without an abort? IIRC, we
> > will get relcache leak WARNING on commit if we left them open in commit path.
>
> Ok, I've added logic to close them.
>
> All changes inspired by your review are included in the v9-0001 patch. The differences since v8 are pulled out into v9_diffs for easier review.

I have reviewed the changes in v9_diffs and looks fine to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2020-06-28 16:05:19 Re: new heapcheck contrib module
Previous Message Dilip Kumar 2020-06-28 15:26:51 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions