Re: matview scannability rehash (was Re: Drastic performance loss in assert-enabled build in HEAD)

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: matview scannability rehash (was Re: Drastic performance loss in assert-enabled build in HEAD)
Date: 2013-04-05 11:59:04
Message-ID: 20130405115904.GA6164@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 04, 2013 at 06:07:17PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Wed, Apr 03, 2013 at 05:49:18PM -0400, Tom Lane wrote:
> >> No. This is an absolute disaster. It's taking something we have always
> >> considered to be an irrelevant implementation detail and making it into
> >> user-visible DDL state, despite the fact that it doesn't begin to satisfy
> >> basic transactional behaviors. We *need* to get rid of that aspect of
> >> things. If you must have scannability state in version 0, okay, but
> >> it has to be a catalog property not this.
>
> > In large part, this ended up outside the catalogs due to key limitations of
> > the startup process.
>
> Yeah, I realize that there's no other (easy) way to make unlogged
> matviews reset to an invalid state on crash, but that doesn't make this
> design choice less of a disaster. It boxes us into something that's
> entirely unable to support transitions between scannable and unscannable
> states by any means short of a complete rewrite of the matview contents;
> which seems fundamentally incompatible with any sort of incremental
> update scenario.

I said:
> > [...] the relfilenode-based indicator
> > (whether the committed approach or something else) doesn't need to remain the
> > only input to the question of scannability. If 9.5 introduces the concept of
> > age-based scannability expiration, the applicable timestamp could go in
> > pg_class, and pg_relation_is_scannable() could check both that and the
> > relfilenode-based indicator.

To make that concrete, suppose we implement something you suggested upthread,
"ALTER MATERIALIZED VIEW foo SET (scannable = false)". Implementing that by
replacing the relfilenode with a single empty page or removing a _scannable
fork would indeed be all wrong. The former blows away data you may wish to
make reappear later, and the latter is non-transactional. But what has us do
it that way? Store the new indicator in reloptions, where it belongs, and
make pg_relation_is_scannable() test "relfilenode_says_scannable &&
reloptions_say_scannable".

> And I remain of the opinion that it's going to box us
> into not being able to fix the problems because of pg_upgrade
> on-disk-compatibility issues.

I said:
> > [a "_scannable" fork] has a few advantages over the current approach: VACUUM
> > won't need a special case, and pg_upgrade will be in a better position to blow
> > away all traces if we introduce a better approach.

pg_upgrade looks for _scannable forks, then translates their presence into the
new representation. The committed single-empty-page strategy doesn't have an
acute problem here, either, though.

> > A slight variation on the committed approach would be to add a "_scannable"
> > relation fork.
>
> Not very transaction-safe, I think (consider crash midway through a
> transaction that adds or removes the fork)

The SQL commands I cited as responsible for creating or removing the fork all
make a new relfilenode anyway. Thus, "add" actually means creating the fork
with the new relfilenode, and "remove" actually means omitting the fork from
the new relfilenode. The association between relfilenodes and relations is,
of course, transactional.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2013-04-05 12:08:24 Re: Hash Join cost estimates
Previous Message Matthias 2013-04-05 11:54:29 Re: Hash Join cost estimates