Re: Mat view sometimes taking 10x the time to refresh concurrently

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Mat view sometimes taking 10x the time to refresh concurrently
Date: 2021-05-27 17:05:50
Message-ID: CAM+6J95d2xdy_4oYUVhjCACBAxCNL0=HWzHgC-h1g3RhM98T4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

" although I'd think that a backup vs. a
concurrent refresh shouldn't have that sort of problem"

yep. I guess pg_dump requests for access share lock iirc.

*****************************************
session 1:
demo=# create table t(id int, name text);
CREATE TABLE
demo=# insert into t select x, md5(x::text) from generate_series(1, 1000) x;
INSERT 0 1000
demo=# CREATE MATERIALIZED VIEW tv as select id,name from t,pg_sleep(5)
where id > 10;
SELECT 990
demo=# create unique INDEX on tv(id);
CREATE INDEX
demo=# refresh materialized view concurrently tv;
REFRESH MATERIALIZED VIEW
demo=# refresh materialized view concurrently tv;
REFRESH MATERIALIZED VIEW
demo=# refresh materialized view concurrently tv;
REFRESH MATERIALIZED VIEW

session2:
postgres(at)go:/tmp$ pg_dump demo > abc.sql # while refresh running.
completes fine.
postgres(at)go:/tmp$ psql
postgres=# \c demo
You are now connected to database "demo" as user "postgres".
demo=# begin;
BEGIN
demo=*# lock table t IN ACCESS SHARE MODE; -- simulate pg_dump
LOCK TABLE
demo=*# select * from tv limit 1; -- also just query the existing view
id | name
----+----------------------------------
11 | 6512bd43d9caa6e02c990b0a82652dca
(1 row)

***********************
so, pg_dump may not be the problem. if i simulated correctly. ( ofcourse
without any exclusive locks on base table t)

On Thu, 27 May 2021 at 22:08, Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:

> Thanks, Tom. I will dig into those ideas. I do think the I/O capacity
> might ultimately be a big factor.
>
> On Thu, May 27, 2021 at 9:36 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Wells Oliver <wells(dot)oliver(at)gmail(dot)com> writes:
>> > Hey, we have a mat view that normally takes ~18m to re-materialize
>> > concurrently, which we do daily. Maybe once a week, it takes ~180m to
>> > refresh, and we're at a loss as to why.
>> > We are running backs during the same general time window, where we do
>> > backup the schema where this mat view is located, could this be an
>> issue?
>> > We have not noticed any lock errors in the backup or log, though.
>> > Any tips on tracing this down would be appreciated.
>>
>> My own mindset would be to wonder if a different/worse plan is being
>> chosen. You could investigate that perhaps by running an EXPLAIN
>> on the matview's query just before each refresh, to see if it changes.
>>
>> Checking pg_locks for ungranted locks while the REFRESH is running
>> would be good to do too, although I'd think that a backup vs. a
>> concurrent refresh shouldn't have that sort of problem.
>>
>> It seems possible also that you're just maxing out the machine's
>> I/O capacity between these two tasks.
>>
>> regards, tom lane
>>
>
>
> --
> Wells Oliver
> wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
>

--
Thanks,
Vijay
Mumbai, India

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Wells Oliver 2021-05-27 21:49:26 Migrating local PG instance to AWS RDS?
Previous Message Tom Lane 2021-05-27 16:43:56 Re: now() and statement_timestamp()