From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: what to revert
Date: 2016-05-04 21:22:41
Message-ID: CACjxUsNEsr+wPpxNA=_z9nMv3ynDopNiongD92KPS3c_zdxsDA@mail.gmail.com
Lists: pgsql-hackers
On Wed, May 4, 2016 at 3:42 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> My concern isn't performing checks at snapshot build time and at buffer
> access time. That seems fairly reasonable. My concern is that the
> time->xid mapping used to determine the xid horizon is built at snapshot
> access time. The relevant function for that is
> MaintainOldSnapshotTimeMapping() called by GetSnapshotData(), if you
> want to look at it. Building the mapping when building the snapshot
> requires that an exclusive lwlock is taken. Given that there are very few
> access patterns in which more xids are assigned than snapshots taken,
> that just doesn't make much sense; and that's leaving aside the fact that
> adding an exclusive lock to a function that's already one of the biggest
> bottlenecks in postgres isn't a good idea.
It seems to me that you are conflating the structure itself with the
point from which the function that fills it is called (and how often).
Some of the proposals involve fairly small tweaks to call
MaintainOldSnapshotTimeMapping() from elsewhere or only when
something changes (like crossing a minute boundary or seeing that a
new TransactionId has been assigned). If you can disentangle those
ideas, it might not look so bad.
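To make that concrete, here is a minimal standalone sketch of the sort
of gating being proposed -- only do the (lock-protected) maintenance
work when a new xid has been assigned or a minute boundary has been
crossed since the last call. The names and types are stand-ins of my
own, not the actual snapmgr.c code or any posted patch:

/*
 * Illustrative model only: gate the expensive map maintenance at the
 * call site so it runs only when something has actually changed.
 * Concurrency is ignored here; the real code would still do the update
 * itself under its existing lock.
 */
#include <stdint.h>
#include <time.h>

typedef uint32_t XidStandin;        /* stand-in for TransactionId */

static XidStandin last_mapped_xid = 0;
static int64_t last_mapped_minute = -1;

static int64_t
current_minute(void)
{
    return (int64_t) time(NULL) / 60;
}

/* Placeholder for the exclusively locked time->xid map update. */
static void
maintain_time_map(XidStandin latest_xid, int64_t minute)
{
    last_mapped_xid = latest_xid;
    last_mapped_minute = minute;
    /* ... append to / overwrite the minute bucket here ... */
}

/*
 * Called from the snapshot path: skip the heavy work (and its lock)
 * unless a new xid has appeared or the minute has rolled over.
 */
void
maybe_maintain_time_map(XidStandin latest_xid)
{
    int64_t minute = current_minute();

    if (latest_xid == last_mapped_xid && minute == last_mapped_minute)
        return;                 /* nothing new to record */

    maintain_time_map(latest_xid, minute);
}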
> If we instead built the map somewhere around GetNewTransactionId() we'd
> only pay the cost when advancing an xid. It'd also make it possible to
> avoid another GetCurrentIntegerTimestamp() per snapshot, including the
> acquisition of oldSnapshotControl->mutex_current. In addition, the data
> structure would be easier to maintain, because xids grow monotonically
> (yes, yes, wraparound, but that's not a problem) - at the moment we need
> somewhat more complex logic to determine into which bucket an
> xid/timestamp pair has to be mapped.
Right.
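To illustrate why that shape is simpler: because both the clock and
the xids advance monotonically, maintenance at xid-assignment time
boils down to appending to (or overwriting) the newest minute slot of
a ring. A minimal standalone sketch, with made-up names and sizes
rather than anything from the patch:

/*
 * Illustrative model only: a per-minute ring of "newest xid seen",
 * maintained where xids are handed out (cf. GetNewTransactionId).
 * Xid wraparound and locking are deliberately ignored here.
 */
#include <stdint.h>
#include <time.h>

#define MAP_SLOTS 60            /* e.g. one slot per minute of threshold */

typedef uint32_t XidStandin;    /* stand-in for TransactionId */

typedef struct
{
    int64_t     head_minute;    /* minute stored in slot "head" */
    int         head;           /* ring index of the newest minute */
    XidStandin  newest_xid[MAP_SLOTS];
} TimeXidRing;

static TimeXidRing ring;        /* zero-initialized */

/* Record a freshly assigned xid in the current minute's slot. */
void
record_assigned_xid(XidStandin xid)
{
    int64_t minute = (int64_t) time(NULL) / 60;

    if (ring.head_minute == 0)
        ring.head_minute = minute;      /* first use */

    while (ring.head_minute < minute)
    {
        /* carry the previous newest xid across minutes with no activity */
        XidStandin prev = ring.newest_xid[ring.head];

        ring.head = (ring.head + 1) % MAP_SLOTS;
        ring.head_minute++;
        ring.newest_xid[ring.head] = prev;
    }

    /* monotonic xids: the latest assignment is always the newest */
    ring.newest_xid[ring.head] = xid;
}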
> So I'm really not just concerned with the performance, I think that's
> just fallout from choosing the wrong data representation.
Wrong. It's a matter of when the calls are made to maintain the
structure.
> If you look at the bottom of
> http://www.postgresql.org/message-id/20160413171955.i53me46fqqhdlztq@alap3.anarazel.de
> you'll see performance numbers for old_snapshot_threshold=0
> vs. old_snapshot_threshold=10. While that wasn't intended at the time,
> old_snapshot_threshold=0 shows the cost of the feature without
> maintaining the mapping, and old_snapshot_threshold=10 shows it while
> building the mapping. Pretty dramatic difference - that's really the
> cost of map maintenance.
Right.
> FWIW, it's not just me doubting the data structure here, Ants did as
> well.
It is true that the patch he posted added one more field to a struct.
>> Surely nobody here is blind to the fact that holding back xmin often
>> creates a lot of bloat for no reason - many or all of the retained old
>> row versions may never be accessed.
>
> Definitely. I *like* the feature.
And I think we have agreed in off-list discussions that even with
these NUMA issues the patch, as it stands, would help some -- the
poor choice of a call site for MaintainOldSnapshotTimeMapping()
just unnecessarily limits how many workloads can benefit.
Hopefully you can understand why I chose to focus on the
performance issues when the feature is disabled, and on the hash
index issue, before moving on to this.
>>>> So I think accepting the promise that this feature would be improved
>>>> in a future release to better support that case is good enough.
>>>
>>> I've not heard any such promise.
>>
>> The question Alvaro is raising isn't whether such a promise has been
>> made but whether it would suffice. In fairness, such promises are not
>> enforceable. Kevin can promise to work on this and then be run over
>> by a rogue Mr. Softee truck tomorrow, and the work won't get done; or
>> he can go to work for some non-PostgreSQL-supporting company and
>> vanish.
>
> Sure, it's not a guarantee. But I think a "promise" (signed in blood of
> course) of a senior contributor makes quite the difference.
How about we wait until we get closer to a go/no-go point, see
where performance has settled, and see what kind of promise would
satisfy you? I'm uncomfortable with the demand for a rather
non-specific promise, and find myself flashing back to some
sections of Catch-22.
>> But personally, I generally agree with Alvaro's analysis. If this
>> patch is slow when turned on, and you don't like that, don't use it.
>
>> If you need to use it and want it to scale better, then you can write
>> a patch to make it do that. Nobody is more qualified than you.
>
> I think that's what ticks me off about this. By introducing a feature
> implemented with the wrong structure, the onus of work is placed on
> others. It's imo perfectly reasonable not to unnecessarily perform
> micro-optimization before benchmarks show a problem, and if it were just
> a question of doing that in 9.7, I'd be fine. But what we're talking
> about is rewriting the central part of the feature.
If you see a need for more than adding a very small number of
fields to any struct or function, please elaborate; if anyone else
has proposed a solution that requires that, I've missed it.
>> I also think it's a show-stopper if you can hold out specific design
>> issues which we can generally agree are signs of serious flaws in
>> Kevin's thinking.
>
> I think that's my issue.
Amit's proof-of-concept patch reduced the problem by a factor of 3
to 4 with a local "if", without any structure changes, and Ants'
proposal looks like it will reduce the locks to once per minute
(potentially making them pretty hard to measure) by adding one
field to one struct -- so I'm really having trouble understanding
what you're getting at. Clarification welcome.
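For what it's worth, the shape I understand from Ants' suggestion is
roughly the following -- one extra field caching the last-mapped
minute, read without the exclusive lock, so the lock is only taken
about once per minute. This is a standalone sketch with names of my
own (and a plain mutex standing in for the LWLock), not his actual
patch:

/*
 * Illustrative model only: an unlocked fast path on a single cached
 * field.  A real implementation would use the existing LWLock and an
 * appropriately ordered atomic read rather than a bare volatile.
 */
#include <pthread.h>
#include <stdint.h>
#include <time.h>

typedef struct
{
    pthread_mutex_t lock;                   /* stand-in for the LWLock */
    volatile int64_t last_mapped_minute;    /* the one extra field */
    /* ... the actual time->xid map would live here ... */
} SnapshotMapControl;

static SnapshotMapControl ctl = {
    .lock = PTHREAD_MUTEX_INITIALIZER,
    .last_mapped_minute = 0
};

void
maintain_map_once_per_minute(void)
{
    int64_t minute = (int64_t) time(NULL) / 60;

    /* unlocked fast path: almost every call returns here */
    if (ctl.last_mapped_minute == minute)
        return;

    pthread_mutex_lock(&ctl.lock);
    if (ctl.last_mapped_minute != minute)   /* recheck under the lock */
    {
        /* ... update the minute bucket(s) here ... */
        ctl.last_mapped_minute = minute;
    }
    pthread_mutex_unlock(&ctl.lock);
}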
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company