Re: PostgreSQL and a Catch-22 Issue related to dead rows

From: Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christophe Pettus <xof(at)thebuild(dot)com>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: PostgreSQL and a Catch-22 Issue related to dead rows
Date: 2024-12-10 08:55:00
Message-ID: AM7P189MB102815FBBAC696A3FE173CEB9D3D2@AM7P189MB1028.EURP189.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

________________________________
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Monday, December 9, 2024 5:07 PM

We do only very coarse-grained analysis of whether a row is "dead".
In principle, if vacuum had access to all the live snapshots of
all sessions, it could realize that a row really is dead even though
it's later than the current global xmin horizon. But discovering that
would be quite difficult and therefore expensive. Notably, sessions
would have to expose far more of their snapshot state than they do
today, and there would have to be interlocks to allow other sessions
to inspect that state safely, and that'd probably put us into much the
same sort of too-many-lock-conflicts problem that the OP has already.

I don't think there's any free lunch here. Maybe there's some
other compromise between amount-of-state-exposed versus
dead-row-discoverability, but finding a better way would take
a great deal of creative effort and testing.

regards, tom lane

Hi again.

On my bike to work this morning I was thinking more about this.

When the application run vacuum I find this "7954046 are dead but not yet removable removable" so the job knows what rows that are involved.

Is it difficult to add parameter like force-dead-rows-removal that we send to the vacuum job that will remove this rows like this ?

Just to test what happens and also how it affects performance.

The tables I work on are unlogged , vacuum command is controlled by the application. If this option is only applicable on unlogged tables which are unsafe anyway and from command line , maybe that can help to reduce complexity and side effects.

I am now testing on PostgreSQL "16.4 (Ubuntu 16.4-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit"

Thanks.

Lars

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Mullineux 2024-12-10 09:22:37 Re: can a blocked transaction affect the performance of one that is blocking it?
Previous Message Lars Aksel Opsahl 2024-12-10 05:32:44 Re: PostgreSQL and a Catch-22 Issue related to dead rows