| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> | 
|---|---|
| To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> | 
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Reviewing freeze map code | 
| Date: | 2016-06-21 04:59:53 | 
| Message-ID: | CAA4eK1J94FQJ8cEVdHmS2aDnPj99kGomScf+QjQLKBZ+6yaHbA@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On Tue, Jun 21, 2016 at 9:08 AM, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:
>
> On Tue, Jun 21, 2016 at 3:29 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Tue, Jun 21, 2016 at 1:03 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> Well, I think generally nobody seriously looked at actually refactoring
> >> heap_update(), even though that'd be a good idea.  But in this instance,
> >> the problem seems relatively fundamental:
> >>
> >> We need to lock the origin page, to do visibility checks, etc. Then we
> >> need to figure out the target page. Even disregarding toasting - which
> >> we could be doing earlier with some refactoring - we're going to have to
> >> release the page level lock, to lock them in ascending order. Otherwise
> >> we'll risk kinda likely deadlocks.
> >
> > Can we consider using some strategy to avoid deadlocks without releasing
> > the lock on the old page?  Consider if we could have a mechanism such that
> > RelationGetBufferForTuple() will ensure that it always returns a new buffer
> > which has targetblock greater than the old block (on which we already hold
> > a lock).  I think the tricky part here is whether we can get anything like
> > that from the FSM.  Also, there could be cases where we need to extend the
> > heap even though there were pages in the heap with space available, because
> > we ignored them as their block number is smaller than the block number on
> > which we hold the lock.
>
> Doesn't that mean that over time, given a workload that does only or
> mostly updates, your records tend to migrate further and further away
> from the start of the file, leaving a growing unusable space at the
> beginning, until you eventually need to CLUSTER/VACUUM FULL?
>
In many cases, an update should ideally fit in the same page as the old
tuple, provided fillfactor is properly configured for update-mostly loads.
Why would the records always migrate further away?  They should be able to
use the space freed by other updates in intermediate pages.  I think there
could be some impact space-wise, but the freed-up space will eventually be
used.
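
To make the trade-off concrete, here is a toy sketch of the "only return a
block above the old one" rule.  It is not PostgreSQL code: ToyFsm and
toy_pick_block_above() are hypothetical names, and a flat array stands in
for the real free space map.  It only shows that lower-numbered pages with
room are skipped and that the relation is extended when nothing above the
old block qualifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: free_bytes[i] is the free space recorded for block i. */
typedef struct ToyFsm
{
    const size_t *free_bytes;
    size_t        nblocks;
} ToyFsm;

/*
 * Return the lowest block strictly above old_block that has at least
 * `needed` bytes free.  If there is none, return fsm->nblocks, which in
 * this toy model means "extend the relation by one block".  Blocks below
 * old_block are never considered, even if they have plenty of space.
 */
static size_t
toy_pick_block_above(const ToyFsm *fsm, size_t old_block, size_t needed)
{
    assert(old_block < fsm->nblocks);

    for (size_t blk = old_block + 1; blk < fsm->nblocks; blk++)
    {
        if (fsm->free_bytes[blk] >= needed)
            return blk;
    }
    return fsm->nblocks;        /* nothing suitable above: extend */
}
```

With free space of {4000, 100, 50, 900, 20} bytes and the old tuple on
block 1, a request that only block 0 could satisfy ends up extending the
relation, which is the space cost being discussed.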
> I was wondering about speculatively asking for a free page with a
> lower block number than the origin page, if one is available, before
> locking the origin page.
Do you want to lock it as well?  In any case, I think adding such code
without first determining whether the update can be accommodated in the
current page could prove costly.
>  Then after locking the origin page, if it
> turns out you need a page but didn't get it earlier, asking for a free
> page with a higher block number than the origin page.
>
Something like that might work out if it is feasible and people agree on
pursuing such an approach.
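
For discussion, the two-step idea could be modelled roughly as below.  This
is a toy planner, not PostgreSQL code: ToyPlan, toy_find() and
toy_plan_update() are hypothetical names, and it assumes the speculative
lower page is locked before the origin page so that locks are always taken
in ascending block order.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_BLOCK ((size_t) -1)

/* Hypothetical result: which blocks get locked, in what order, and the target. */
typedef struct ToyPlan
{
    size_t lock_order[2];
    int    nlocks;
    size_t target;
} ToyPlan;

/* First block in [lo, hi) with at least `needed` bytes free, or TOY_NO_BLOCK. */
static size_t
toy_find(const size_t *free_bytes, size_t lo, size_t hi, size_t needed)
{
    for (size_t blk = lo; blk < hi; blk++)
    {
        if (free_bytes[blk] >= needed)
            return blk;
    }
    return TOY_NO_BLOCK;
}

static ToyPlan
toy_plan_update(const size_t *free_bytes, size_t nblocks,
                size_t origin, size_t needed)
{
    ToyPlan plan = { { TOY_NO_BLOCK, TOY_NO_BLOCK }, 0, TOY_NO_BLOCK };
    size_t  lower;
    size_t  higher;

    assert(origin < nblocks);

    /* Step 1: before locking the origin, speculatively look below it. */
    lower = toy_find(free_bytes, 0, origin, needed);
    if (lower != TOY_NO_BLOCK)
        plan.lock_order[plan.nlocks++] = lower;

    /* Step 2: lock the origin; only now do we learn whether the update fits. */
    plan.lock_order[plan.nlocks++] = origin;
    if (free_bytes[origin] >= needed)
    {
        plan.target = origin;   /* any speculative lower lock was wasted work */
        return plan;
    }
    if (lower != TOY_NO_BLOCK)
    {
        plan.target = lower;    /* already locked, and below the origin */
        return plan;
    }

    /* Step 3: nothing below, so ask for a block above the origin or extend. */
    higher = toy_find(free_bytes, origin + 1, nblocks, needed);
    plan.target = (higher != TOY_NO_BLOCK) ? higher : nblocks;
    plan.lock_order[plan.nlocks++] = plan.target;
    return plan;
}
```

The case where the update fits in the origin page but a lower page was
locked anyway is the extra cost I am worried about.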
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com