Re: Vacuum Full Analyze Stalled

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Jeff Kirby <Jeff(dot)Kirby(at)wicourts(dot)gov>, pgsql-admin(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Vacuum Full Analyze Stalled
Date: 2005-10-04 01:54:51
Message-ID: 983.1128390891@sss.pgh.pa.us
Lists: pgsql-admin pgsql-hackers

[ I just noticed that this thread is happening on pgsql-admin, which is
completely inappropriate for discussing bugs in a beta version.
Please redirect followups to pgsql-hackers. ]

I wrote:
> ... The hypothesis I'm thinking about is that VACUUM is trying to do
> LockBufferForCleanup() and for some reason it never finishes.

I set up a simple-minded reproduction of Kevin's situation: I did

    create domain dint as int check (value > 0);
    create table manyd (f1 dint, f2 dint, f3 dint,
        f4 dint, f5 dint, f6 dint, f7 dint, f8 dint, f9 dint, f10 dint);

and then ran ten concurrent clients doing this continuously:

    insert into manyd values(1,2,3,4,5,6,7,8,9,10);

which should be enough to force a lot of indexscans on
pg_constraint_contypid_index. I added an additional client doing

    create domain d1 as int check (value > 0);
    drop domain d1;

to ensure that there were dead rows needing vacuuming in pg_constraint.
(BTW, Tatsuo's new version of pgbench lets me do all this without
writing a line of code...)
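
For anyone wanting to try the same workload, here is a minimal sketch of how it might be driven through pgbench custom scripts. The file names, the database name "test", and the transaction count are assumptions for illustration, not the settings actually used above. The helper only writes the two script files and builds the pgbench argument lists; it does not run anything.

```python
import tempfile
from pathlib import Path

# SQL taken from the message above; one script per kind of client.
INSERT_SQL = "insert into manyd values(1,2,3,4,5,6,7,8,9,10);\n"
CHURN_SQL = (
    "create domain d1 as int check (value > 0);\n"
    "drop domain d1;\n"
)


def write_scripts(directory):
    """Write the two custom pgbench scripts and return their paths.

    The file names insert.sql and churn.sql are hypothetical.
    """
    directory = Path(directory)
    insert_path = directory / "insert.sql"
    churn_path = directory / "churn.sql"
    insert_path.write_text(INSERT_SQL)
    churn_path.write_text(CHURN_SQL)
    return insert_path, churn_path


def build_commands(dbname, insert_path, churn_path, transactions=100000):
    """Return two pgbench argument lists; nothing is executed here.

    -n skips pgbench's own vacuum step, -c sets the client count,
    -t the transactions per client, -f the custom script file.
    The default transaction count is an arbitrary assumption.
    """
    common = ["pgbench", "-n", "-t", str(transactions)]
    inserters = common + ["-c", "10", "-f", str(insert_path), dbname]
    churner = common + ["-c", "1", "-f", str(churn_path), dbname]
    return inserters, churner


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    ins_path, churn_path = write_scripts(workdir)
    for cmd in build_commands("test", ins_path, churn_path):
        # Print the command lines so they can be started in separate shells.
        print(" ".join(cmd))
```

The two printed commands would be started side by side, after the dint domain and manyd table have been created, so that the ten inserting clients and the one create/drop client overlap.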

Finally, I added some debug printouts to LockBufferForCleanup so I
could see if it was being executed or not.

Then I tried both manual and autovacuum-driven vacuums of pg_constraint.
I was able to see from the debug printouts that LockBufferForCleanup was
sometimes forced to wait in both cases. But it never got "stuck".

This eliminates one thing I was worrying about, which was the
possibility that the LockBufferForCleanup waiting path was completely
broken inside autovacuum for some reason. But it doesn't get us a whole
lot closer to a solution.

At this point I think we need more info from Kevin and Jeff before we
can go further. There must be some additional special feature of their
application that makes the problem appear, but what?

A stack trace of the stuck process would definitely help...

regards, tom lane
