| From: | Robert Haas <robertmhaas(at)gmail(dot)com> | 
|---|---|
| To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> | 
| Cc: | Tatsuo Ishii <ishii(at)postgresql(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: BRIN index and aborted transaction | 
| Date: | 2015-07-21 20:47:00 | 
| Message-ID: | CA+Tgmoa=j9J8gGwbxttuKWk=KOqJNkTCo9djVhbLAmO1t390-g@mail.gmail.com | 
| Lists: | pgsql-hackers | 

On Sat, Jul 18, 2015 at 5:11 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Yeah, that's a bit of an open problem: we don't have any mechanism to
> mark a block range as needing resummarization, yet.  I don't have any
> great ideas there, TBH.  Some options that were discussed but never led
> anywhere:
>
> 1. whenever a heap tuple is deleted that's minimum or maximum for a
> column, mark the index tuple as needing resummarization.  On a future
> vacuuming pass the index would be updated.  (I think this works for
> minmax, but I don't see how to apply it to inclusion).
>
> 2. have block ranges be resummarized randomly during vacuum.
>
> 3. Have index tuples last for only X number of transactions, marking them
> as needing summarization when that expires.
>
> 4. Have a user-invoked function that re-runs summarization.  That way
> the user can implement any of the above policies, or others.

Maybe I'm confused here, but it seems like the only time
re-summarization can be needed is when tuples are pruned.  The mere
act of deleting a tuple, even if the delete goes on to commit, doesn't
create a scenario where re-summarization can work out to a win,
because there may still be snapshots that can see it.  At the point
where we prune the tuple, though, there might well be a benefit in
re-summarizing, because now a newly-computed summary value won't need
to cover a value that previously had to be there.
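
To make the delete-versus-prune distinction concrete, here is a toy sketch of a minmax summary over one block range. It is not PostgreSQL code: `HeapTuple`, its `deleted` and `pruned` flags, and `summarize` are hypothetical names invented for illustration, and visibility is reduced to a single flag.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    value: int
    deleted: bool = False  # delete committed, but an old snapshot may still see the tuple
    pruned: bool = False   # physically removed; no snapshot can see it any more


def summarize(block_range):
    """Return (min, max) over every tuple still physically present, or None if empty."""
    present = [t.value for t in block_range if not t.pruned]
    return (min(present), max(present)) if present else None


block_range = [HeapTuple(10), HeapTuple(20), HeapTuple(500)]
before = summarize(block_range)

block_range[2].deleted = True        # the DELETE commits
after_delete = summarize(block_range)

block_range[2].pruned = True         # the tuple is finally pruned
after_prune = summarize(block_range)
```

In this sketch the summary stays at (10, 500) after the delete commits, because the summary must still cover the deleted value, and only narrows to (10, 20) once the tuple is pruned.
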

But it seems obviously impractical to re-summarize when we HOT-prune,
so it seems like the obvious thing to do is make vacuum do it.  We
know during phase one of vacuum whether we saw any dead tuples in page
range X-Y; if yes, re-summarize.  The only reason not to do this is if
it causes us to do a lot of resummarization that frequently fails to
produce a smaller range. Do you have any experimental data suggesting
that this is or is not a problem?
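
As a rough sketch of what that could look like, and of the measurement being asked about, the following toy code maps blocks where dead tuples were seen to page ranges, then reports what fraction of resummarized ranges did not get any narrower. The function names and the dict-of-`(min, max)` representation are assumptions made for illustration, not anything in the BRIN code; `pages_per_range` is meant to mirror the BRIN storage parameter of that name, and every range is assumed to be non-empty.

```python
def ranges_to_resummarize(dead_tuple_blocks, pages_per_range=128):
    """Return the first block of every page range in which dead tuples were seen."""
    return sorted({(blk // pages_per_range) * pages_per_range
                   for blk in dead_tuple_blocks})


def wasted_fraction(old_summaries, new_summaries):
    """Fraction of resummarized ranges whose (min, max) did not get any narrower."""
    if not new_summaries:
        return 0.0
    wasted = 0
    for start, (new_lo, new_hi) in new_summaries.items():
        old_lo, old_hi = old_summaries[start]
        # Neither bound moved inward, so the resummarization bought nothing.
        if new_lo <= old_lo and new_hi >= old_hi:
            wasted += 1
    return wasted / len(new_summaries)


# Hypothetical data: blocks where vacuum's first phase saw dead tuples.
candidates = ranges_to_resummarize([3, 5, 130, 700])

old = {0: (1, 100), 128: (50, 900), 640: (7, 8)}
new = {0: (1, 100), 128: (50, 300), 640: (7, 8)}
fraction = wasted_fraction(old, new)
```

With this made-up input, three ranges are candidates and two of them come back with an unchanged summary. A consistently high fraction on real workloads would be the argument against resummarizing unconditionally.
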
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company