Re: Materialized views WIP patch

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Ants Aasma <ants(at)cybertec(dot)at>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2013-03-05 21:51:52
Message-ID: 1362520312.80777.YahooMailNeo@web162903.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> All that having been said, it's hard for me to imagine that
> anyone really cares about any of this until we have an
> incremental update feature, which right now we don't.

These are actually independent of one another, as long as we nail
down how we're determining "freshness" -- which is probably needed
for either.  Someone who's immersed in tuning long-running DW
queries might be interested in this before incremental update.
(They might load the data once per month, so refreshing the MVs as
a step in that process might be cheaper than incrementally
maintaining them.)  Someone could base "freshness" on
pg_relation_is_scannable() and start working on automatic query
rewrite right now, if they wanted to.

> Actually, I'm betting that's going to be significantly harder
> than automatic-query-rewrite, when all is said and done.
> Automatic-query-rewrite, if and when we get it, will not be easy
> and will require a bunch of work from someone with a good
> understanding of the planner, but it strikes me as the sort of
> thing that might work out to one large project and then it's
> done.

I still think we're going to hit the wall on planning time under
certain circumstances and need to tweak that over the course of
several releases, but now is not the time to get into the details
of why I think that.  We've spent way too much time on it already
for the point we're at in the 9.3 cycle.  I've kept my concerns
hand-wavy on purpose, and am trying hard to resist the temptation
to spend a lot of time demonstrating the problems.

> Whereas, incremental update sounds to me like a series of
> projects over a series of releases targeting various special
> cases, where we can always point to some improvements vs. release
> N-1 but we're never actually done

Exactly.  I predict that we will eventually have some special sort
of trigger for maintaining MVs based on base table changes to
handle the ones that are just too expensive (in developer time or
run time) to fully automate.  But there is a lot of low-hanging
fruit for automation.

> Even a reasonably simplistic and partial implementation of
> incremental update will benefit a lot of users.

Agreed.

> But in terms of relative difficulty, it's not at all obvious to
> me that that's the easier part of the project.

I totally agree that getting something working to use MVs in place
of underlying tables is not all that different or more difficult
than using partial indexes.  I just predict that we'll get a lot of
complaints about cases where it results in worse performance and
we'll need to deal with those issues.  I don't seem that as being
brain surgery; just a messy matter of trying to get this pretty
theory to work well in the real world -- probably using a bunch of
not-so-elegant heuristics.  And in the end, the best you can hope
for is performance not noticeably worse than you would get if you
modified your query to explicitly use the MV(s) -- you're just
saving yourself the rewrite.  Well, OK, there is the point that,
(like indexes) if you run the query which hits the base tables with
different parameters, and a new plan is generated each time, it
might pick different MVs or exclude them as is most efficient for
the given parameters.  That's the Holy Grail of all this.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Nicolas Barbier 2013-03-05 21:58:44 Re: Materialized views WIP patch
Previous Message Kevin Grittner 2013-03-05 21:09:08 Re: Materialized views WIP patch

Browse pgsql-hackers by date

  From Date Subject
Next Message Nicolas Barbier 2013-03-05 21:58:44 Re: Materialized views WIP patch
Previous Message Alvaro Herrera 2013-03-05 21:42:18 Re: sql_drop Event Trigger