Re: PostgreSQL and a Catch-22 Issue related to dead rows

From: Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christophe Pettus <xof(at)thebuild(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: PostgreSQL and a Catch-22 Issue related to dead rows
Date: 2024-12-10 14:03:41
Message-ID: AM7P189MB1028DE31357B6B278FDEC3589D3D2@AM7P189MB1028.EURP189.PROD.OUTLOOK.COM
Lists: pgsql-performance

________________________________
From: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Sent: Tuesday, December 10, 2024 2:03 PM
To: Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Christophe Pettus <xof(at)thebuild(dot)com>; pgsql-performance(at)lists(dot)postgresql(dot)org <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: PostgreSQL and a Catch-22 Issue related to dead rows

On Tue, Dec 10, 2024 at 3:55 AM Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no<mailto:Lars(dot)Opsahl(at)nibio(dot)no>> wrote:
Is it difficult to add a parameter like force-dead-rows-removal that we send to the vacuum job so that it removes these rows?
I'm still not sure what the ask here is - complete, literal removal of the dead rows? That's not how PG works. I'm wondering if this is an XY problem: your queries are slow, and you think it's because of autovacuum's report of dead rows. But let's take a step back and look at the actual queries that are slowing down. Perhaps there are other solutions: less indexing, more freezing, smarter updates, different partitioning, tweaking fillfactor, etc. There are lots of things we can try that will be orders of magnitude simpler than trying to redesign MVCC/vacuuming. :)
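For example, a couple of those knobs can be tried per table without touching the application; the table name and numbers below are just placeholders to illustrate the kind of tuning meant, not recommended values:

    -- Hypothetical table and values, only to show per-table storage parameters.
    ALTER TABLE node SET (
        fillfactor = 70,                        -- leave free space on each page for HOT updates
        autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% dead rows instead of the default 20%
        autovacuum_freeze_max_age = 100000000   -- freeze rows earlier than the default 200 million
    );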

Hi

Yes, we can solve this with more divide-and-conquer work, but it complicates things a lot. I basically do partitioning now by splitting the PostGIS Topology into many hundreds and sometimes many thousands of different topologies, but there is a limit to how much I can split up and later merge, because that also has a cost.

So the main issue seems to be related to dead rows. We have examples of queries like 'SELECT node_id, geom FROM node WHERE containing_face = 0;' going from 2008.947 ms down to 0.072 ms after we committed an unrelated job, so that its xmin no longer blocked removal of the dead rows. There is more information at https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/issues/67 . Running ANALYZE just before the query does not help; the only thing that seems to help is getting rid of the rows reported as "DETAIL: 195929 dead row versions cannot be removed yet, oldest xmin: 3475136501".
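For reference, this is roughly how we check that picture, using only the standard catalog views ('node' stands in for the topology's node table):

    -- Dead rows autovacuum still reports for the table:
    SELECT relname, n_dead_tup, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'node';

    -- Which session is holding back the xmin horizon:
    SELECT pid, state, xact_start, backend_xmin
    FROM pg_stat_activity
    WHERE backend_xmin IS NOT NULL
    ORDER BY age(backend_xmin) DESC;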

Seen from the outside, I am not asking for a redesign 🙂 just a parameter to test forced removal of dead rows when working on unlogged tables and running VACUUM from the command line, but that seems to be more complicated than I was hoping.
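To be concrete, today we just run a plain manual vacuum on the table; the parameter I am asking about would simply be one more option on that command (it does not exist, the commented line is only to show where it would fit):

    VACUUM (VERBOSE, ANALYZE) node;
    -- hypothetical, not a real option: VACUUM (VERBOSE, FORCE_DEAD_ROWS_REMOVAL) node;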

Thanks.

Lars
