Quick Links

Re: Possible Bug in pg_upgrade

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Dave Byrne <dbyrne(at)mdb(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Possible Bug in pg_upgrade
Date:	2011-08-15 03:27:55
Message-ID:	201108150327.p7F3RtM21294@momjian.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Dave Byrne wrote:
> Beginning with commit 002c105a0706bd1c1e939fe0f47ecdceeae6c52d
> pg_upgrade will fail if there are orphaned temp tables in the current
> database with the message 'old and new databases "postgres" have a
> different number of relations'
>
> On line 41 of pg_upgrade/info.c pg_upgrade checks that the number of
> relations are the same but includes orphaned temp tables in the comparison.
>
> Is this expected behavior? If so is there a more detailed error message
> that can be added explain the cause of the failure? It wasn't evident
> until modified pg_upgrade to show the missing oid's and recognized them
> as temp tables with oid2name.

Thanks for your report and patch.

Let me give some background on pg_upgrade to explain what is happening.
Pg_upgrade uses two C arrays to store information about tables and
indexes for the old and new clusters. It is not possible to store this
information in a database because both clusters are down when pg_upgrade
needs to use this information.

In pre-9.1 pg_upgrade, pg_upgrade did a sequential scan of the arrays
looking for a match between old and new cluster objects. This was
reported as slow for databases with many objects, and I could reproduce
the slowness. I added some caching in 9.0 but the real solution for 9.1
was to assume a one-to-one mapping between the old and new C arrays,
i.e. the 5th entry in the old cluster array is the same as the 5th
element in the new cluster array.

I knew this was risky but was the right solution so it doesn't surprise
me you found a failure. pg_upgrade checks that the size of the two
arrays in the same and also checks that each element matches --- the
former is what generated your error.

Now, about the cause. I had not anticipated that orphaned temp objects
could exist in either cluster. In fact, this case would have generated
an error in 9.0 as well, but with a different error message.

Looking futher, pg_upgrade has to use the same object filter as
pg_dump, and pg_dump uses this C test:

pg_dump.c: else if (strncmp(nsinfo->dobj.name, "pg_", 3) == 0 ||

pg_dumpall uses this filter:

"WHERE spcname !~ '^pg_' "

The problem is that the filter used by pg_upgrade only excluded
pg_catalog, not pg_temp* as well.

I have developed the attached two patches, one for 9.0, and the second
for 9.1 and 9.2 which will make pg_upgrade now match the pg_dump
filtering and produce proper results for orphaned temp tables by
filtering them.

As far as unlogged tables, those are dumped by 9.1/9.2, so there is no
need to check relpersistence in this patch. pg_dump doesn't check
relistemp either.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment	Content-Type	Size
/rtmp/pg_upgrade.9.0	text/x-diff	4.5 KB
/rtmp/pg_upgrade.9.1	text/x-diff	4.6 KB

In response to

Possible Bug in pg_upgrade at 2011-08-10 18:47:24 from Dave Byrne

Responses

Re: Possible Bug in pg_upgrade at 2011-08-16 02:39:38 from Bruce Momjian

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	daveg	2011-08-15 07:27:36	Re: error: could not find pg_class tuple for index 2662
Previous Message	Joachim Wieland	2011-08-15 01:31:20	synchronized snapshots