Docs for refresh materialized view concurrently

From: Jeremy Finzel <finzelj(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Docs for refresh materialized view concurrently
Date: 2019-05-31 21:41:14
Message-ID: CAMa1XUi2nbHVVa6PrFTygpwPY1b_FYsJc6pywj3qPHoLdZ03_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Speaking with Robert today at pgcon, I happily discovered that REFRESH
MATERIALIZED VIEW CONCURRENTLY actually only updates rows that have changed
since the last refresh, rather than rewriting every row. In my curiosity,
I went to the docs, and found that this detail is not mentioned anywhere.

This is a great feature that is being undersold, and it should be made
clear in the docs.

In my experience, there can be tons of WAL generated from large
materialized views and the normal REFRESH (without CONCURRENTLY). I had
assumed the only benefit of CONCURRENTLY was to allow concurrent access to
the table. But actually the incremental refresh is a much bigger win for
us in reducing WAL overhead drastically.

I've not submitted a patch before, and have a few suggestions I'd like
feedback on before I write one (for the docs only).

1.

First, even this summary looks untrue:

REFRESH MATERIALIZED VIEW — replace the contents of a materialized view.

"replace" is not really accurate with the CONCURRENTLY option, because in
fact it only updates changed rows.

Perhaps instead of "replace":

- "replace or incrementally update the contents of a materialized view".

Also, the Description part has the same inaccuracy:

"completely replaces the contents of a materialized view.....The old
contents are discarded."

That is not true with CONCURRENTLY, correct? Only the old contents *which
have changed* are discarded.

2.

Lastly, I would suggest adding something like the following to the first
paragraph under CONCURRENTLY:

- With this option, only actual changed rows are updated in the
materialized view, which can significantly reduce the amount of write churn
and WAL traffic from a refresh if only a small number of rows will change
with each refresh. It is recommended to have a unique index on the
materialized view if possible, which will improve the performance of a
concurrent refresh.

Please correct me if my understanding of this is not right.

3.

On a different note, none of the documentation on materialized views notes
that they can only be LOGGED. This should be noted, or at least it should
be noted that one cannot create an UNLOGGED materialized view in the same
place it says that one cannot create a temporary one (under Description in
CREATE MATERIALIZED VIEW).

Thanks!
Jeremy Finzel

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Floris Van Nee 2019-06-01 04:01:38 Re: Index Skip Scan
Previous Message Tom Lane 2019-05-31 21:39:44 Re: compiling PL/pgSQL plugin with C++