From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Problems with the FSM, heap fillfactor, and temporal locality
Date: 2020-08-26 08:45:54
Message-ID: CACPNZCtjHMx1wd=p-dgu=X1yTjTxmY0zheb4bO2bv9yvxp-Deg@mail.gmail.com
Lists: pgsql-hackers
On Tue, Aug 25, 2020 at 5:17 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> I think that the sloppy approach to locking for the
> fsmpage->fp_next_slot field in functions like fsm_search_avail() (i.e.
> not using real atomic ops, even though we could) is one source of
> problems here. That might end up necessitating fixing the on-disk
> format, just to get the FSM to behave sensibly -- assuming that the
> value won't be too stale in practice is extremely dubious.
>
> This fp_next_slot business interacts poorly with the "extend a
> relation by multiple blocks" logic added by commit 719c84c1be5 --
> concurrently inserting backends are liable to get the same heap block
> from the FSM, causing "collisions". That almost seems like a bug IMV.
> We really shouldn't be giving out the same block twice, but that's what
> my own custom instrumentation shows happens here. With atomic ops, it
> isn't a big deal to restart using a compare-and-swap at the end (when
> we set/reset fp_next_slot for other backends).
The fact that that logic extends by 20 * numwaiters to get optimal
performance is a red flag that resources aren't being allocated
efficiently. I have an idea to ignore fp_next_slot entirely if we have
extended by multiple blocks: the backend that does the extension
stores in the FSM root page 1) the number of blocks added and 2) the
end-most block number. Any request for space will look for a valid
value here first before doing the usual search. If there is one, the
block to try is based on a hash of the xid. Something like:
candidate-block = prev-end-of-relation + 1 + (xid % (num-new-blocks))
To guard against collisions, peek in the FSM at that slot, and if it's
not completely empty, search the FSM using a "look-nearby" API and
increment a counter every time we collide. When the counter reaches
some threshold, clear the special area in the root page so that future
backends use the usual search.
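
To make that concrete, here is a minimal standalone sketch in C.
Everything in it is invented for illustration: the FsmRootExtension
struct, the fp_last_new_block / fp_num_new_blocks / fp_collisions
fields, and the COLLISION_LIMIT threshold don't exist anywhere in the
tree, and a real version would live in the FSM root page and need
appropriate locking.

#include <stdint.h>

typedef uint32_t BlockNumber;
typedef uint32_t TransactionId;

#define InvalidBlockNumber  ((BlockNumber) 0xFFFFFFFF)
#define COLLISION_LIMIT     8   /* arbitrary threshold for this sketch */

/* Hypothetical extra state stored in the FSM root page by the extender */
typedef struct FsmRootExtension
{
    BlockNumber fp_last_new_block;  /* end-most block of the bulk extend */
    uint32_t    fp_num_new_blocks;  /* how many blocks that extend added */
    uint32_t    fp_collisions;      /* how often the hashed slot was taken */
} FsmRootExtension;

/*
 * Pick a candidate block for this backend.  Returns InvalidBlockNumber
 * when the special area is unset (or has been cleared), in which case
 * the caller falls back to the usual FSM search.
 */
static BlockNumber
candidate_block(FsmRootExtension *ext, TransactionId xid)
{
    BlockNumber prev_end;

    if (ext->fp_num_new_blocks == 0)
        return InvalidBlockNumber;

    /* candidate-block = prev-end-of-relation + 1 + (xid % num-new-blocks) */
    prev_end = ext->fp_last_new_block - ext->fp_num_new_blocks;
    return prev_end + 1 + (xid % ext->fp_num_new_blocks);
}

/*
 * Called when the hashed candidate turned out not to be empty and we
 * had to fall back to a nearby search.  Once collisions pile up, clear
 * the special area so later backends go straight to the normal search.
 */
static void
note_collision(FsmRootExtension *ext)
{
    if (++ext->fp_collisions >= COLLISION_LIMIT)
        ext->fp_num_new_blocks = 0;     /* disable the fast path */
}

A backend would call candidate_block() first and fall back to the
normal fsm_search_avail() search when it returns InvalidBlockNumber or
when the hashed slot turns out to be occupied.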
I think this would work well with your idea to be more picky if the
xid stored with the relcache target block doesn't match the current
one.
Also, num-new-blocks above could be scaled down from the actual number
of blocks added, just to make sure writes aren't happening all over
the place.
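
Continuing the sketch above (the divisor 4 is just a placeholder):

/*
 * Hypothetical helper: advertise only a fraction of the freshly added
 * blocks, so concurrent inserters cluster their writes instead of
 * scattering them across the whole extension.
 */
static uint32_t
scaled_new_blocks(uint32_t blocks_added)
{
    uint32_t scaled = blocks_added / 4;

    return (scaled > 0) ? scaled : 1;
}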
There might be holes in this idea, but it may be worth trying to do
better in this area without adding stricter locking.
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services