| From: | Yushi Ogiwara <btogiwarayuushi(at)oss(dot)nttdata(dot)com> | 
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> | 
| Cc: | Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Fix for consume_xids advancing XIDs incorrectly | 
| Date: | 2024-10-18 05:57:36 | 
| Message-ID: | 57b588db6017ba20e67980fefa388b86@oss.nttdata.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi,
Thank you for your suggestion. Let me organize the discussion as 
follows:
When I created the patch, I did not account for situations where other 
transactions concurrently consume XIDs. If we establish the rule that 
"consume_xids should not return an XID that is consumed by any other 
transaction," I believe it will simplify the discussion.
Case: consume_xids(1)
Let's consider a scenario where transaction 1 (t1) and transaction 2 
(t2) run concurrently. t1 calls txid_current() and consume_xids(1), 
while t2 calls txid_current().
The schedule above illustates when t1 and t2 run concurrently. The "next 
XID" column shows how the TransamVariables->nextXid global variable 
changes over time. The /**/ blocks shows the value of each variable and 
outcome of a called function.
In my previous patch, the schedule above could occur, violating the rule 
because consume_xids does not update lastxid when GetTopTransactionId is 
called.
If your suggestion is applied, consume_xids would no longer violate the 
rule, as it would update lastxid when generating a new XID.
Case: consume_xids(0)
Now, consider the case where transaction 1 consumes 0 XIDs.
In this example, both t1 and t2 consume XID = 20, which violates the 
rule.
Using ReadNextFullTransactionId() - 1 also does not resolve the issue, 
as demonstrated above. This is because consume_xids(0) does not allocate 
any XIDs.
In conclusion, I agree that:
  	* Update lastxid with GetTopTransactionId().
  	* consume_xids with 0 should raise an error.
I have attached a new patch that incorporates your suggestions.
Regards,
Yushi
On 2024-10-17 02:58, Masahiko Sawada wrote:
> On Tue, Oct 15, 2024 at 10:06 PM Yushi Ogiwara
> <btogiwarayuushi(at)oss(dot)nttdata(dot)com> wrote:
> 
>> Hi,
>> 
>> Thank you for your comment.
>> 
>> Regarding the first patch, I believe it works correctly when
>> consume_xids(1) is called.  This is because the lastxid variable in 
>> the
>> consume_xids_common function is initialized as lastxid =
>> ReadNextFullTransactionId(), where the ReadNextFullTransactionId
>> function returns the (current XID) + 1.
> 
> But it's possible that concurrent transactions consume XIDs in 
> meanwhile, no?
> 
>> Separately, I found that consume_xids(0) does not behave as expected.
>> Below is an example:
>> 
>> postgres=# select txid_current();
>> txid_current
>> --------------
>> 45496
>> (1 row)
>> 
>> postgres=# select consume_xids(0);
>> consume_xids
>> --------------
>> 45497
>> (1 row)
>> 
>> postgres=# select consume_xids(0);
>> consume_xids
>> --------------
>> 45497
>> (1 row)
>> 
>> In the example, the argument to consume_xids is 0, meaning it should 
>> not
>> consume any XIDs. However, the first invocation of consume_xids(0) 
>> looks
>> like unexpectedly consuming 1 XID though it's not consuming actually.
>> This happens because consume_xids(0) returns the value from
>> ReadNextFullTransactionId.
>> 
>> I have updated the patch (skip.diff, attached to this e-mail) to 
>> address
>> this issue. Now, when consume_xids(0) is called, it returns
>> ReadNextFullTransactionId().value - 1, ensuring no XID is consumed as
>> shown below:
>> 
>> postgres=# select txid_current();
>> txid_current
>> --------------
>> 45498
>> (1 row)
>> 
>> postgres=# select consume_xids(0);
>> consume_xids
>> --------------
>> 45498
>> (1 row)
> 
> Hmm, I think if we expect this function to return the last XID that
> the function actually consumed, calling consume_xids with 0 should
> raise an error instead. Even if it returns
> ReadNextFullTransactionId().value - 1 as you proposed, other
> concurrent transactions might consume XIDs between txid_current() and
> consume_xids(0), resulting in consume_xids() appearing to have
> consumed XIDs.
> 
> Regards,
| Attachment | Content-Type | Size | 
|---|---|---|
| consume_xid_fix.diff | text/x-diff | 975 bytes | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Seino Yuki | 2024-10-18 06:00:50 | Re: Add “FOR UPDATE NOWAIT” lock details to the log. | 
| Previous Message | Antonin Houska | 2024-10-18 05:51:21 | Re: [PoC] Federated Authn/z with OAUTHBEARER |