Re: Inaccuracy in VACUUM's tuple count estimates

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inaccuracy in VACUUM's tuple count estimates
Date: 2014-06-11 12:54:11
Message-ID: 20140611125411.GV8406@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-06-09 11:24:22 -0700, Kevin Grittner wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-06-09 09:45:12 -0700, Kevin Grittner wrote:
>
> > I am not sure, given predicate.c's coding, how
> > HEAPTUPLE_DELETE_IN_PROGRESS could cause problems. Could you elaborate,
> > since that's the contentious point with Tom? Since 'both in progress'
> > can only happen if xmin and xmax are the same toplevel xid and you
> > resolve subxids to toplevel xids, I think it should currently be safe
> > either way?
>
> The only way that it could be a problem is if the DELETE is in a
> subtransaction which might get rolled back without rolling back the
> INSERT.

The way I understand the code, in that case the subxid in xmax would have
been resolved to the toplevel xid:

    /*
     * Find top level xid. Bail out if xid is too early to be a conflict, or
     * if it's our own xid.
     */
    if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
        return;
    xid = SubTransGetTopmostTransaction(xid);
    if (TransactionIdPrecedes(xid, TransactionXmin))
        return;
    if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
        return;

That should essentially make that case harmless, right? So the
optimization (and pessimization in other cases) of only tracking
toplevel xids seems to save the day here?
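
To make the argument concrete, here is a toy model of that resolution step.
It is only a sketch, not the predicate.c code: ToyXid, toy_parent,
toy_set_parent, toy_topmost and toy_needs_conflict_check are made-up names,
the parent array is a hypothetical stand-in for the subtransaction lookup,
and a plain integer comparison replaces the wraparound-aware
TransactionIdPrecedes().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy transaction id type; hypothetical, not PostgreSQL's TransactionId. */
typedef uint32_t ToyXid;

#define TOY_INVALID_XID 0
#define TOY_MAX_XID     16

/*
 * toy_parent[x] is the parent of x, or TOY_INVALID_XID if x is a toplevel
 * xid. Assumption: a flat array stands in for the real subxid lookup.
 */
static ToyXid toy_parent[TOY_MAX_XID];

/* Record that 'child' is a subtransaction of 'parent'. */
static void
toy_set_parent(ToyXid child, ToyXid parent)
{
    assert(child < TOY_MAX_XID && parent < TOY_MAX_XID);
    toy_parent[child] = parent;
}

/* Walk up the parent chain until a toplevel xid is reached. */
static ToyXid
toy_topmost(ToyXid xid)
{
    assert(xid < TOY_MAX_XID);
    while (toy_parent[xid] != TOY_INVALID_XID)
        xid = toy_parent[xid];
    return xid;
}

/*
 * Mirrors the three bail-out tests quoted above. Returns true only if the
 * toplevel owner of 'xid' is another transaction that is not older than
 * 'xmin_horizon'. Plain '<' is a simplification that ignores wraparound.
 */
static bool
toy_needs_conflict_check(ToyXid xid, ToyXid my_top_xid, ToyXid xmin_horizon)
{
    if (xid == my_top_xid)
        return false;           /* our own toplevel xid */
    xid = toy_topmost(xid);     /* subxid -> toplevel xid */
    if (xid < xmin_horizon)
        return false;           /* too early to be a conflict */
    if (xid == my_top_xid)
        return false;           /* one of our own subxids */
    return true;
}
```

With subxids 11 and 12 both registered under toplevel xid 10 via
toy_set_parent(), toy_topmost() maps both to 10. In this toy model an
insert done under 11 and a delete done under 12 are therefore attributed to
the same transaction, whichever of the two xids the caller passes in.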

>  If we ignore the conflict because we assume the INSERT
> will be negated by the DELETE, and that doesn't happen, we would
> get false negatives which would compromise correctness.  If we
> assume that the DELETE might not happen when the DELETE is not in a
> separate subtransaction we might get a false positive, which would
> only be a performance hit.  If we know either is possible and have
> a way to check in predicate.c, it's fine to check it there.

Given the above, I don't think this can currently happen. Am I understanding
it correctly? If so, it certainly deserves a comment...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
