From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode |
Date: | 2012-06-14 18:50:51 |
Message-ID: | CA+Tgmobs1DcPAY=-szCeRiGwe-hn9ADBjZXxy_S48jDLLD0v9A@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> This patch is problematic because formally indexes used by syscaches needs to
> be unique, this one is not though because of 0/InvalidOids entries for
> nailed/shared catalog entries. Those values aren't allowed to be queried though.
That's not the only reason it's not unique. Take a look at
GetNewRelFileNode(). We really only guarantee that <database OID,
tablespace OID, relfilenode, backend-ID>, taken as a four-tuple, is
unique. You could have the same relfilenode in different tablespaces,
or even within the same tablespace with different backend-IDs. The
latter might not matter for you because you're presumably disregarding
temp tables, but the former probably does. It's an uncommon scenario
because we normally set relid = relfilenode, and of course relid is
unique across the database, but if the table gets rewritten then you end
up with relid != relfilenode, and I don't think there's anything at
that point that will prevent the new relfilenode from being chosen as
some other relation's relfilenode, as long as it's in a different
tablespace.
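
To make the collision concrete, here is a toy Python model. It is not
PostgreSQL code: the RelFile tuple, the index_by() helper, and all OIDs
are made up for illustration. It only shows that an index on relfilenode
alone can map to two relations, while (tablespace OID, relfilenode) stays
unique in this example.

```python
from collections import namedtuple

# Hypothetical on-disk identity of a relation (the four-tuple from the text).
RelFile = namedtuple("RelFile", "db_oid spc_oid relfilenode backend_id")

# Made-up catalog contents: relation OID -> on-disk identity.
# Both relations were rewritten at some point, so relid != relfilenode,
# and they ended up with the same relfilenode in different tablespaces.
relations = {
    16384: RelFile(db_oid=12345, spc_oid=1663, relfilenode=16400, backend_id=None),
    16387: RelFile(db_oid=12345, spc_oid=16500, relfilenode=16400, backend_id=None),
}


def index_by(key_func):
    """Group relation OIDs by an arbitrary key derived from RelFile."""
    index = {}
    for relid, relfile in relations.items():
        index.setdefault(key_func(relfile), []).append(relid)
    return index


# Keyed on relfilenode alone: one key, two relations (not unique).
by_relfilenode = index_by(lambda rf: rf.relfilenode)

# Keyed on (tablespace OID, relfilenode): every key has exactly one relation.
by_spc_and_node = index_by(lambda rf: (rf.spc_oid, rf.relfilenode))
```
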
I think the solution may be to create a specialized cache for this,
rather than relying on the general syscache infrastructure. You might
look at, e.g., attoptcache.c for an example. That would allow you to
build a cache that is aware of things like the relmapper
infrastructure, and the fact that temp tables are ignorable for your
purposes. But I think you will need to include at least the
tablespace OID in the key along with the relfilenode to make it
bullet-proof.
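
As a rough sketch of the shape such a cache could take, here is a toy
Python version. It is not modeled on the actual attoptcache.c code:
PgClassRow, RelfilenodeCache, and the mapped_relfilenodes dict (a
stand-in for the relmapper) are hypothetical names, and the OIDs are
illustrative. It assumes that mapped relations store 0 as their
relfilenode in pg_class and that temp tables can simply be skipped.

```python
from collections import namedtuple

# Hypothetical, simplified view of a pg_class row.
PgClassRow = namedtuple("PgClassRow", "oid reltablespace relfilenode is_temp")


class RelfilenodeCache:
    """Toy (tablespace OID, relfilenode) -> relation OID cache."""

    def __init__(self, rows, mapped_relfilenodes):
        self._rows = rows
        self._mapped = mapped_relfilenodes  # stand-in for the relmapper
        self._cache = None                  # built lazily, dropped on invalidation

    def invalidate(self):
        """Forget everything; the next lookup rebuilds from the rows."""
        self._cache = None

    def _build(self):
        cache = {}
        for row in self._rows:
            if row.is_temp:
                continue  # temp tables are ignorable for this purpose
            # A relfilenode of 0 marks a mapped relation: ask the map instead.
            node = row.relfilenode or self._mapped.get(row.oid)
            if node:
                cache[(row.reltablespace, node)] = row.oid
        self._cache = cache

    def lookup(self, spc_oid, relfilenode):
        if self._cache is None:
            self._build()
        return self._cache.get((spc_oid, relfilenode))


rows = [
    PgClassRow(16384, 1663, 16400, False),
    PgClassRow(16387, 16500, 16400, False),  # same relfilenode, other tablespace
    PgClassRow(16395, 1663, 16401, True),    # temp table, skipped
    PgClassRow(1259, 1663, 0, False),        # mapped relation, resolved via the map
]
cache = RelfilenodeCache(rows, mapped_relfilenodes={1259: 12000})
```

With the tablespace OID in the key, cache.lookup(1663, 16400) and
cache.lookup(16500, 16400) resolve to different relations even though
the relfilenode is the same.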
I haven't read through the patch series far enough to know what this
is being used for yet, but my fear is that you're using it to handle
mapping a relfilenode extracted from the WAL stream back to a relation
OID. The problem with that is that relfilenode assignments obey
transaction semantics. So, if someone begins a transaction, truncates
a table, inserts a tuple, and commits, the heap_insert record is going
to refer to a relfilenode that, according to the system catalogs,
doesn't exist. This is similar to one of the worries in my other
email, so I won't belabor the point too much more here...
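
The truncate-then-insert scenario can be played through in a toy Python
model. This is not how PostgreSQL stores catalogs or WAL: the two dicts,
the wal list, and all OIDs are invented for illustration. It only shows
that a lookup against the committed catalog state, made while the
transaction is still open, cannot resolve the relfilenode named in the
heap_insert record.

```python
# Made-up state: relfilenode -> relation OID, as other sessions see it.
committed_catalog = {16400: 16384}

# Catalog changes made by the in-progress transaction, not yet visible.
pending_catalog = {}

# Simplified WAL stream: (record type, relfilenode).
wal = []

# BEGIN; TRUNCATE t;
# The table gets a new relfilenode, recorded only in the open transaction.
pending_catalog[16410] = 16384

# INSERT INTO t ...;
# The heap_insert record names the new relfilenode.
wal.append(("heap_insert", 16410))

# A reader that consults only committed catalog state at this point
# cannot map the record back to a relation.
record_type, relfilenode = wal[0]
lookup_before_commit = committed_catalog.get(relfilenode)

# COMMIT;
# The old mapping disappears and the new one becomes visible.
del committed_catalog[16400]
committed_catalog.update(pending_catalog)
lookup_after_commit = committed_catalog.get(relfilenode)
```
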
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company