Re: postgres materialized view refresh performance

From: Philip Semanchuk <philip(at)americanefficient(dot)com>
To: Ayub M <hiayub(at)gmail(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: postgres materialized view refresh performance
Date: 2020-10-23 13:01:03
Message-ID: 2626DFA9-3446-49D7-A8D7-31EC4863E7D7@americanefficient.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> On Oct 22, 2020, at 3:53 PM, Ayub M <hiayub(at)gmail(dot)com> wrote:
>
> There is a table t which is used in a mview mv, this is the only table in the mview definition.
>
> create table t (c1 int, ..., c10 int
> );
>
> -- there is a pk on say c1 column
> create materialized view mv as select c1, c2...c10 from
> t;
>
> ---there is a unique index on say c5 and bunch of other indexes on the mview.
> The reason there is a mview created instead of using table t, is that that the table gets truncated and reloaded every couple of hours and we don't want users to see an empty table at any point of time that's why mview is being used.
>
> Using "refresh materialized view concurrently", this mview is being used by APIs and end users.
>
> Couple of questions I have -

Hi Ayub,
I’m not an expert on the subject; I hope you’ll get an answer from someone who is. :-) Until then, my answers might help.

> • Whenever mview refresh concurrently happens, does pg create another set of table and indexes and switch it with the orig? If no, then does it update the existing data?

My understanding is that when CONCURRENTLY is specified, Postgres implements the refresh as a series of INSERT, UPDATE, and DELETE statements on the existing view. So the answer to your question is no, Postgres doesn’t create another table and then swap it.

> • The mview gets refreshed in a couple of mins sometimes and sometimes it takes hours. When it runs for longer, there are no locks and no resource shortage, the number of recs in the base table is 6m (7.5gb) which is not huge so why does it take so long to refresh the mview?

Does the run time correlate with the number of changes being made?

> • Does mview need vacuum/analyze/reindex?

My understanding is that when CONCURRENTLY is specified, yes it does need vacuuming, because of the aforementioned implementation of REFRESH as a series of INSERT, UPDATE, and DELETE statements.

In our situation, we have large views that are refreshed once per week. We want to ensure that the view is in the best possible shape for users, so we create the view with autovacuum_enabled = false and then run an explicit vacuum/analyze step immediately after the refresh rather than leaving it to chance.

Cheers
Philip

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message PALAYRET Jacques 2020-10-23 13:34:48 Conditional column filtering with pglogical replication
Previous Message Michael Paquier 2020-10-23 03:19:59 Re: PG 9.2 slave restarted - cache not impacted