Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Timothy Garnett <tgarnett(at)panjiva(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>
Subject: Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date: 2015-04-30 03:52:51
Message-ID: CAA4eK1+_pXTZZ837TdAnFmhdXLQTM3oxPyCmUo1VQcWOygn3CQ@mail.gmail.com
Lists: pgsql-bugs

On Tue, Apr 28, 2015 at 11:24 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:
>
> Alvaro Herrera wrote:
>
>
> Pushed. I chose find_multixact_start() as a name for this function.
>

I have tested the latest change to confirm that it fixes the
reported problem. The results are below; to me it looks like the
reported problem is fixed.

I used the test program (explode_mxact_members) developed by Thomas
to reproduce the problem. I started one transaction in a session.
After running the test for 3~4 hours with the parameters
explode_mxact_members 500 35000, I could see warning messages
like the ones below (before the fix there were no such messages; the
test ran to completion but corrupted the database):

WARNING: database with OID 1 must be vacuumed before 358 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING: database with OID 1 must be vacuumed before 310 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING: database with OID 1 must be vacuumed before 261 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING: database with OID 1 must be vacuumed before 211 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING: database with OID 1 must be vacuumed before 160 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
explode_mxact_members: explode_mxact_members.c:38: main: Assertion
`PQresultStatus(res) == PGRES_TUPLES_OK'
failed.
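
For anyone who wants to watch how fast the remaining headroom shrinks
in output like the above, here is a minimal sketch (not part of Thomas's
test program) that pulls the database OID and the remaining member count
out of the warning lines. The function name remaining_members is my own
invention for illustration; the pattern only assumes the warning wording
shown in this mail.

```python
import re

# Matches the server warning even when it has been wrapped across lines,
# e.g. "... more multixact\nmembers are used".
WARNING_RE = re.compile(
    r"database with OID (\d+) must be vacuumed before (\d+) more multixact\s+members"
)


def remaining_members(log_text):
    """Return (database OID, remaining members) pairs in log order.

    Hypothetical helper: it only parses text, it does not talk to a server.
    """
    return [(int(oid), int(left)) for oid, left in WARNING_RE.findall(log_text)]


if __name__ == "__main__":
    sample = (
        "WARNING: database with OID 1 must be vacuumed before 358 more multixact\n"
        "members are used\n"
        "WARNING: database with OID 1 must be vacuumed before 310 more multixact\n"
        "members are used\n"
    )
    for oid, left in remaining_members(sample):
        print(oid, left)
```

Feeding it the captured stderr of the test run gives the sequence of
counts, which makes it easy to see whether the headroom is still dropping.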

After this I set vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age to zero, then ran
VACUUM FREEZE on template1 and postgres, followed by a
manual CHECKPOINT. I could see the values below in pg_database.

postgres=# select oid,datname,datminmxid from pg_database;
  oid  |  datname  | datminmxid
-------+-----------+------------
     1 | template1 |   17111262
 13369 | template0 |   17111262
 13374 | postgres  |   17111262
(3 rows)

I started the test again as ./explode_mxact_members 500 35000, but it
failed immediately:
500 sessions connected...
Loop 0...
WARNING: database with OID 13369 must be vacuumed before 12 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING: database with OID 13369 must be vacuumed before 11 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
WARNING: database with OID 13369 must be vacuumed before 9 more multixact
members are used
HINT: Execute a database-wide VACUUM in that database, with reduced
vacuum_multixact_freeze_min_age and
vacuum_multixact_freeze_table_age settings.
explode_mxact_members: explode_mxact_members.c:38: main: Assertion
`PQresultStatus(res) == PGRES_TUPLES_OK'
failed.

It was confusing to me why it failed the next time even
though I had run VACUUM FREEZE and CHECKPOINT. I then waited
for a minute or two and ran VACUUM FREEZE with the command below:
./vacuumdb -a -F
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template1"

At this point I verified that all files in the members directory except
one had been deleted.
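
A check like this can also be scripted rather than done by eye. The sketch
below lists the files under pg_multixact/members for a given data
directory. The function name member_segments and the data_dir argument
are assumptions made for illustration; pass the path of your own cluster's
data directory.

```python
import os
import sys


def member_segments(data_dir):
    """List the files under <data_dir>/pg_multixact/members, sorted by name.

    Hypothetical helper: data_dir is whatever data directory the cluster uses.
    """
    members_dir = os.path.join(data_dir, "pg_multixact", "members")
    return sorted(
        name
        for name in os.listdir(members_dir)
        if os.path.isfile(os.path.join(members_dir, name))
    )


if __name__ == "__main__":
    # Usage: python member_segments.py /path/to/data/directory
    if len(sys.argv) > 1:
        segments = member_segments(sys.argv[1])
        print(len(segments), "file(s):", " ".join(segments))
```

Running it before and after the vacuum and checkpoint shows whether the
members files have actually been removed yet.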

After that, when I restarted the test, it ran perfectly fine and never
led to any warning messages, probably because the values of
vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age
were zero.

I am still not sure why it took some time to clean the members directory
and resume the test after running Vacuum Freeze and Checkpoint.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
