Re: Inaccuracy in VACUUM's tuple count estimates

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inaccuracy in VACUUM's tuple count estimates
Date: 2014-06-11 13:24:41
Message-ID: 1402493081.56534.YahooMailNeo@web122302.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-06-09 11:24:22 -0700, Kevin Grittner wrote:

>> The only way that it could be a problem is if the DELETE is in a
>> subtransaction which might get rolled back without rolling back the
>> INSERT.
>
> The way I understand the code in that case the subxid in xmax would have
> been resolved the toplevel xid.
>
>     /*
>     * Find top level xid.  Bail out if xid is too early to be a conflict, or
>     * if it's our own xid.
>     */
>     if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
>         return;
>     xid = SubTransGetTopmostTransaction(xid);
>     if (TransactionIdPrecedes(xid, TransactionXmin))
>         return;
>     if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
>         return;
>
> That should essentially make that case harmless, right?

Hmm.  Since the switch statement above doesn't limit the
HEAPTUPLE_DELETE_IN_PROGRESS or HEAPTUPLE_INSERT_IN_PROGRESS cases
by visibility, we are safe.  I had been thinking that we had early
returns based on visibility, like the we do for HEAPTUPLE_LIVE and
HEAPTUPLE_RECENTLY_DEAD.  Since we don't do that, there is no
problem with the code either before or after your recent change.
Apologies that I didn't notice that sooner.

It might be possible that with more guarantees of what those return
codes mean (possibly by adding the new one mentioned by Tom) we
could add that early return in one or both cases, which would
better optimize some corner cases involving subtransactions.  But
now doesn't seem like a good time to worry about that.  Such a
micro-optimization would not be a sane candidate for back-patching,
or for introducing this late in a release cycle.

So, nothing to see here, folks.  Move along.  predicate.c is
neutral in this matter.

Good spot, BTW!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-06-11 13:39:36 Re: [HACKERS] Question about partial functional indexes and the query planner
Previous Message Fujii Masao 2014-06-11 12:59:02 Re: pg_receivexlog add synchronous mode