From: | Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com> |
---|---|
To: | Mike Beachy <mbeachy(at)gmail(dot)com> |
Cc: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: -1/0 virtualtransaction |
Date: | 2021-04-27 17:46:48 |
Message-ID: | CAM+6J95nkXHiGPw1=i7WPcAaz7ZqiL4OXdWhfeV8636y879AZQ@mail.gmail.com |
Lists: | pgsql-general |
Hi,
I am just trying to jump in, but ignore this if it is not relevant.
When you said *Eventually this results in an "out of shared memory"
error*, can you rule out the two scenarios below: /dev/shm being too small
in Docker, or a query requesting too many locks because of parallelism or
partitioning?
I have read about multiple earlier cases of "out of shared memory" caused
by one of these:
PostgreSQL: You might need to increase max_locks_per_transaction
(cybertec-postgresql.com)
<https://www.cybertec-postgresql.com/en/postgresql-you-might-need-to-increase-max_locks_per_transaction/>
PostgreSQL at low level: stay curious! · Erthalion's blog
<https://erthalion.info/2019/12/06/postgresql-stay-curious/#2-shared-memory>
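A rough way to check the second scenario is to compare the number of locks
currently held with the approximate size of the shared lock tables. The
query below is only a sketch: it assumes the sizing formula given in the
PostgreSQL documentation, max_locks_per_transaction * (max_connections +
max_prepared_transactions), and the analogous one for
max_pred_locks_per_transaction. The column aliases are illustrative, and the
capacities are estimates rather than hard limits (pg_locks also lists
fast-path locks, which do not use the shared table).

```sql
-- Approximate capacity of the shared lock tables vs. current usage.
-- Treat every number here as an estimate, not an exact limit.
WITH conf AS (
    SELECT current_setting('max_connections')::int
         + current_setting('max_prepared_transactions')::int AS slots
)
SELECT
    current_setting('max_locks_per_transaction')::int * conf.slots      AS approx_lock_capacity,
    (SELECT count(*) FROM pg_locks WHERE mode <> 'SIReadLock')          AS regular_locks_held,
    current_setting('max_pred_locks_per_transaction')::int * conf.slots AS approx_pred_lock_capacity,
    (SELECT count(*) FROM pg_locks WHERE mode = 'SIReadLock')           AS siread_locks_held
FROM conf;
```

If siread_locks_held keeps creeping toward approx_pred_lock_capacity while
no long transactions are open, that would point at the predicate lock table
rather than at /dev/shm.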
Also, is this repeatable (given that you mention it happens and eventually
leads to "out of shared memory")?
I may be missing something, but I do not see a PID even though it has a
lock granted on a page. Was the process terminated explicitly or
implicitly (leaving an orphan lingering)?
ps auwwxx | grep postgres
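To list the entries in question from inside the database, something like
the query below may help. It is only a sketch: it assumes the lingering
locks show up in pg_locks with an empty pid or a virtualtransaction of
'-1/0', as described in this thread, and the relation_name alias is just
for readability.

```sql
-- SIReadLock entries that are not attached to a live backend.
-- relation::regclass resolves names only for the current database;
-- for other databases it prints the bare OID.
SELECT virtualtransaction,
       pid,
       locktype,
       relation::regclass AS relation_name,
       page,
       tuple
FROM pg_locks
WHERE mode = 'SIReadLock'
  AND (pid IS NULL OR virtualtransaction = '-1/0')
ORDER BY relation, page, tuple;
```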
I took the below from "src/test/regress/sql/tidscan.sql" to simulate
SIReadLock with an orphan process (by killing the process), but it gets
reaped fine for me :(
postgres=# \d tidscan
              Table "public.tidscan"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
postgres=# INSERT INTO tidscan VALUES (1), (2), (3);
postgres=# BEGIN ISOLATION LEVEL SERIALIZABLE;
BEGIN
postgres=*# SELECT * FROM tidscan WHERE ctid = '(0,1)';
 id
----
  1
(1 row)
postgres=*# -- locktype should be 'tuple'
SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock';
 locktype |    mode
----------+------------
 tuple    | SIReadLock
(1 row)
postgres=*# -- locktype should be 'tuple'
SELECT pid, locktype, mode FROM pg_locks WHERE mode = 'SIReadLock';
 pid  | locktype |    mode
------+----------+------------
 2831 | tuple    | SIReadLock
(1 row)
I thought one could attach gdb or strace to the PID to figure out what it
did before crashing.
As always, I have little knowledge of PostgreSQL, so feel free to ignore
this if nothing is relevant.
Thanks,
Vijay
On Tue, 27 Apr 2021 at 19:55, Mike Beachy <mbeachy(at)gmail(dot)com> wrote:
> Hi Laurenz -
>
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>>
>
> No, the -1 in the virtualtransaction (
> https://www.postgresql.org/docs/11/view-pg-locks.html) for
> pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
> end as I have nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.
>
> I still need to put more effort into Tom's idea about SIReadLock hanging
> out after the transaction, but some evidence pointing in this direction is
> that I've reduced the number of db connections and found that the '-1/0'
> locks will eventually go away! I interpret this as the db needing to find
> time when no overlapping read/write transactions are present. This doesn't
> seem completely correct, as I don't have any long lived transactions
> running while these locks are hanging out. Confusion still remains, for
> sure.
>
> Mike