From: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Bug with GIN index over JSONB data: "ERROR: buffer 10112 is not owned by resource owner" |
Date: | 2023-11-10 03:42:08 |
Message-ID: | CAEG8a3LkcfaUw=p2MF4w97RSc=EJpdWVziskH93xTX4Vs8npdA@mail.gmail.com |
Lists: | pgsql-bugs |
On Fri, Nov 10, 2023 at 9:44 AM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> I was looking into a possible scalability problem with GIN indexes under concurrent insert, but instead I found an uncharacterized bug. One of the processes will occasionally throw an error "ERROR: buffer 10112 is not owned by resource owner Portal" where the buffer number changes from run to run.
>
> I've verified this with both 14.9 and 16.1, on Ubuntu 22.04. I use an AWS m5.4xlarge machine, and haven't tried to verify it on anything else. I don't currently have any real hardware with enough CPUs to do a meaningful test.
>
> I've attached the "user data" file I feed to AWS to run the test; this one is for v14.9. The v16.1 one is similar, except that I compile PostgreSQL myself (without JIT) rather than getting it from apt. I stand up an Ubuntu 22.04 m5.4xlarge machine with all the defaults, except changing the storage from 8GB to 80GB, and feed it the attached user data cloud-init file.
>
> If you don't want to parse the meat out of the file, the core of the test is to run this command with some escalating level of concurrency in a loop. Each call just inserts one JSONB object with highly redundant keys (the same 10 keys present in every row) but a more distinctive value for each key.
>
> insert into j (j) select jsonb_object_agg(x::text, left(md5(random()::text),5)) from generate_series(1,10) f(x);
>
> I've never seen the error occur until the concurrency reaches at least 4, but sample size is too low for that to be definitive.
>
> Unless someone has some better idea, my next step will be to switch the column from jsonb to text[] and see if it exists there as well.
>
> I assume the synchronous_commit=off is needed because without it you couldn't accumulate enough trials to spot the bug, even though it would exist in that setting. I guess I could run the test on a machine with very fast SSD and leave synchronous_commit=on, but I'm not looking forward to the cost of renting a machine that can do that or figuring out how to configure it. I also haven't tried it with fastupdate on. I assume the test would not work because the pending list would grow without bound at high concurrencies (it would grow faster than a single-threaded cleaner could clean it) and so not seeing the bug would not mean it wasn't present.
>
> The test loops the insert for one minute, at each concurrency from 1 to 10, then starts over at -c 1 again. It seems like if you don't see the bug within the first 20 minutes (the first two 1-to-10 concurrency cycles) you are unlikely to see it at all. But that is more a hunch than a formal analysis.
>
> Cheers,
>
> Jeff
I can reproduce this by checking out commit
e9f075f9a15593fe31c610e15cfc71a5fa281ede, but master seems OK, since
Heikki committed some ResourceOwner-related work after that.
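
For anyone who wants to try this without the cloud-init file, here is a minimal sketch of the loop Jeff describes (one minute per concurrency level, 1 to 10, then starting over). It is not taken from his attachment. It assumes the test is driven by pgbench, that a table j with a jsonb column j and a GIN index created with fastupdate off already exists, and that synchronous_commit is off. The file name insert.sql, the database name postgres and the helper names are hypothetical.

```python
import subprocess

# The INSERT statement quoted from Jeff's message.
INSERT_SQL = (
    "insert into j (j) select jsonb_object_agg(x::text, "
    "left(md5(random()::text),5)) from generate_series(1,10) f(x);"
)


def pgbench_cmd(clients, seconds=60, script="insert.sql", dbname="postgres"):
    """Build one pgbench invocation (script and dbname are assumed names)."""
    return [
        "pgbench",
        "-n",                  # skip vacuuming pgbench's own tables
        "-f", script,          # custom script file holding INSERT_SQL
        "-c", str(clients),    # number of concurrent clients
        "-j", str(clients),    # one worker thread per client
        "-T", str(seconds),    # run for this many seconds
        dbname,
    ]


def schedule(cycles=2, max_clients=10):
    """Concurrency levels 1..max_clients, repeated `cycles` times."""
    return [c for _ in range(cycles) for c in range(1, max_clients + 1)]


def run(cycles=2):
    """Run the schedule; return True once the error shows up in stderr."""
    with open("insert.sql", "w") as f:
        f.write(INSERT_SQL + "\n")
    for clients in schedule(cycles):
        result = subprocess.run(
            pgbench_cmd(clients), capture_output=True, text=True
        )
        if "is not owned by resource owner" in result.stderr:
            print(f"reproduced at -c {clients}")
            return True
    return False
```

Calling run() with the default of two cycles takes about 20 minutes, which matches the window in which Jeff expects the error to appear, if it appears at all.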
--
Regards
Junwang Zhao