Re: -1/0 virtualtransaction

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Mike Beachy <mbeachy(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: -1/0 virtualtransaction
Date: 2021-04-27 17:46:48
Message-ID: CAM+6J95nkXHiGPw1=i7WPcAaz7ZqiL4OXdWhfeV8636y879AZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I am just trying to jump in, but ignore if not relevant.

when you said *Eventually this results in an "out of shared memory"
error *

Can you rule out the below two scenarios (wrt /dev/shm too low in docker or
query requesting for too many locks either due to parallellism/partition
involved)
There have been multiple cases of out of shared memory i have read earlier
for due to above.

PostgreSQL: You might need to increase max_locks_per_transaction
(cybertec-postgresql.com)
<https://www.cybertec-postgresql.com/en/postgresql-you-might-need-to-increase-max_locks_per_transaction/>
PostgreSQL at low level: stay curious! · Erthalion's blog
<https://erthalion.info/2019/12/06/postgresql-stay-curious/#2-shared-memory>

also, is this repeatable (given you mention it happens and eventually lead
to "out of shared memory")

I may be missing something, but i do not see a PID even though it has a
lock granted on a page, was the process terminated explicitly or
implicitly. ( and an orphan lingering ? )
ps auwwxx | grep postgres

I took the below from "src/test/regress/sql/tidscan.sql" to simulate
SIReadLock with an orphan process (by killing the process), but it gets
reaped fine for me :(

postgres=# \d tidscan
Table "public.tidscan"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | |

postgres=# INSERT INTO tidscan VALUES (1), (2), (3);

postgres=# BEGIN ISOLATION LEVEL SERIALIZABLE;
BEGIN
postgres=*# SELECT * FROM tidscan WHERE ctid = '(0,1)';
id
----
1
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode =
'SIReadLock';
locktype | mode
----------+------------
tuple | SIReadLock
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT pid, locktype, mode FROM pg_locks WHERE mode = 'SIReadLock';
pid | locktype | mode
------+----------+------------
2831 | tuple | SIReadLock
(1 row)

i thought one could attach a gdb or strace to the pid to figure out what it
did before crashing.

As always, I have little knowledge on postgresql, feel free to ignore if
nothing relevant.

Thanks,
Vijay

On Tue, 27 Apr 2021 at 19:55, Mike Beachy <mbeachy(at)gmail(dot)com> wrote:

> Hi Laurenz -
>
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>>
>
> No, the -1 in the virtualtransaction (
> https://www.postgresql.org/docs/11/view-pg-locks.html) for
> pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
> end as I have nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.
>
> I still need to put more effort into Tom's idea about SIReadLock hanging
> out after the transaction, but some evidence pointing in this direction is
> that I've reduced the number of db connections and found that the '-1/0'
> locks will eventually go away! I interpret this as the db needing to find
> time when no overlapping read/write transactions are present. This doesn't
> seem completely correct, as I don't have any long lived transactions
> running while these locks are hanging out. Confusion still remains, for
> sure.
>
> Mike

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2021-04-27 20:16:01 Re: very long secondary->primary switch time
Previous Message Tomas Pospisek 2021-04-27 17:15:03 very long secondary->primary switch time