| From: | Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> | 
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Implementing Incremental View Maintenance | 
| Date: | 2019-05-14 06:46:48 | 
| Message-ID: | 20190514154648.63fea174748c18ed7f7dcb16@sraoss.co.jp | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Mon, 1 Apr 2019 12:11:22 +0900
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> On Thu, 27 Dec 2018 21:57:26 +0900
> Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> 
> > Hi,
> > 
> > I would like to implement Incremental View Maintenance (IVM) on PostgreSQL.  
> 
> I am now working on an initial patch for implementing IVM on PostgreSQL.
> This enables materialized views to be updated incrementally after one
> of their base tables is modified.
Attached is a WIP patch of Incremental View Maintenance (IVM).
Major part is written by me, and changes in syntax and pg_class
are Hoshiai-san's work.
Although this is sill a draft patch in work-in-progress, any
suggestions or thoughts would be appreciated.
 
* What it is
This allows a kind of Immediate Maintenance of materialized views. if a 
materialized view is created by CRATE INCREMENTAL MATERIALIZED VIEW command,
the contents of the mateview is updated automatically and incrementally
after base tables are updated. Noted this syntax is just tentative, so it
may be changed.
====== Example 1 ======
postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m AS SELECT * FROM t0;
SELECT 3
postgres=# SELECT * FROM m;
 i 
---
 3
 2
 1
(3 rows)
postgres=# INSERT INTO t0 VALUES (4);
INSERT 0 1
postgres=# SELECt * FROM m; -- automatically updated
 i 
---
 3
 2
 1
 4
(4 rows)
=============================
This implementation also supports matviews including duplicate tuples or
DISTINCT clause in its view definition query.  For example, even if a matview
is defined with DISTINCT to remove duplication of tuples in a base table, this
can perform incremental update of the matview properly. That is, the contents
of the matview doesn't change when exiting tuples are inserted into the base
tables, and a tuple in the matview is deleted only when duplicity of the
corresponding tuple in the base table becomes zero.
This is due to "colunting alogorithm" in which the number of each tuple is
stored in matviews as a special column value.
====== Example 2 ======
postgres=# SELECT * FROM t1;
 id | t 
----+---
  1 | A
  2 | B
  3 | C
  4 | A
(4 rows)
postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m1 AS SELECT t FROM t1;
SELECT 3
postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m2 AS SELECT DISTINCT t FROM t1;
SELECT 3
postgres=# SELECT * FROM m1; -- with duplicity
 t 
---
 A
 A
 C
 B
(4 rows)
postgres=# SELECT * FROM m2;
 t 
---
 A
 B
 C
(3 rows)
postgres=# INSERT INTO t1 VALUES (5, 'B');
INSERT 0 1
postgres=# DELETE FROM t1 WHERE id IN (1,3); -- delete (1,A),(3,C)
DELETE 2
postgres=# SELECT * FROM m1; -- one A left and one more B
 t 
---
 B
 B
 A
(3 rows)
postgres=# SELECT * FROM m2; -- only C is removed
 t 
---
 B
 A
(2 rows)
=============================
* How it works
1. Creating matview
When a matview is created, AFTER triggers are internally created
on its base tables. When the base tables is modified (INSERT, DELETE,
UPDATE), the matview is updated incrementally in the trigger function.
When populating the matview, GROUP BY and count(*) are added to the
view definition query before this is executed for counting duplicity
of tuples in the matview. The result of count is stored in the matview
as a special column named "__ivm_count__". 
2. Maintenance of matview
When base tables are modified, the change set of the table can be
referred as Ephemeral Named Relations (ENRs) thanks to Transition Table
(a feature of trigger implemented since PG10). We can calculate the diff
set of the matview by replacing the base table in the view definition
query with the ENR (at least if it is Selection-Projection -Join view). 
As well as view definition time, GROUP BY and count(*) is added in order
to count the duplicity of tuples in the diff set. As a result, two diff
sets (to be deleted from and to be inserted into the matview) are
calculated, and the results are stored into temporary tables respectively.
The matiview is updated by merging these change sets. Instead of executing
DELETE or INSERT simply, the values of __ivm_count__ column in the matview
is decreased or increased. When the values becomes zero, the corresponding
tuple is deleted from the matview.
3. Access to matview
When SELECT is issued for IVM matviews defined with DISTINCT, all columns
except to __ivm_count__ of each tuple in the matview are returned.  This is 
correct because duplicity of tuples are eliminated by GROUP BY.
When DISTINCT is not used, SELECT for the IVM matviews returns each tuple
__ivm_count__ times. Currently, this is implemented by rewriting the SELECT
query to replace the matview RTE with a subquery which joins the matview
and generate_series function as bellow. 
SELECT mv.* FROM mv, generate_series(1, mv.__ivm_count__);
__ivm_count__ column is invisible for users when "SELECT * FROM ..." is
issued, but users can see the value by specifying in target list explicitly.
====== Example 3 ======
postgres=# \d+ m1
                                  Materialized view "public.m1"
    Column     |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description 
---------------+--------+-----------+----------+---------+----------+--------------+-------------
 t             | text   |           |          |         | extended |              | 
 __ivm_count__ | bigint |           |          |         | plain    |              | 
View definition:
 SELECT t1.t
   FROM t1;
Access method: heap
postgres=# \d+ m2
                                  Materialized view "public.m2"
    Column     |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description 
---------------+--------+-----------+----------+---------+----------+--------------+-------------
 t             | text   |           |          |         | extended |              | 
 __ivm_count__ | bigint |           |          |         | plain    |              | 
View definition:
 SELECT DISTINCT t1.t
   FROM t1;
Access method: heap
postgres=# SELECT *, __ivm_count__ FROM m1;
 t | __ivm_count__ 
---+---------------
 B |             2
 B |             2
 A |             1
(3 rows)
postgres=# SELECT *, __ivm_count__ FROM m2;
 t | __ivm_count__ 
---+---------------
 B |             2
 A |             1
(2 rows)
postgres=# EXPLAIN SELECT * FROM m1;
                                  QUERY PLAN                                  
------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..61.03 rows=3000 width=2)
   ->  Seq Scan on m1 mv  (cost=0.00..1.03 rows=3 width=10)
   ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=0)
(3 rows)
=============================
* Simple Performance Evaluation
I confirmed that "incremental" update of matviews is more effective
than the standard REFRESH by using simple exapmle.  I used tables
of pgbench (SF=100) here.
Create two matviews, that is, without and with IVM.
test=# CREATE MATERIALIZED VIEW bench1 AS
        SELECT aid, bid, abalance, bbalance
        FROM pgbench_accounts JOIN pgbench_branches USING (bid)
        WHERE abalance > 0 OR bbalance > 0;
SELECT 5001054
test=# CREATE INCREMENTAL MATERIALIZED VIEW bench2 AS
        SELECT aid, bid, abalance, bbalance
        FROM pgbench_accounts JOIN pgbench_branches USING (bid)
        WHERE abalance > 0 OR bbalance > 0;
SELECT 5001054
The standard REFRESH of bench1 took more than 10 seconds.
test=# \timing 
Timing is on.
test=# REFRESH MATERIALIZED VIEW bench1 ;
REFRESH MATERIALIZED VIEW
Time: 11210.563 ms (00:11.211)
Create an index on the IVM matview (bench2).
test=# CREATE INDEX on bench2(aid,bid);
CREATE INDEX
Updating a tuple in pgbench_accounts took 18ms. After this, bench2
was updated automatically and correctly.
test=# SELECT * FROM bench2 WHERE aid = 1;
 aid | bid | abalance | bbalance 
-----+-----+----------+----------
   1 |   1 |       10 |       10
(1 row)
Time: 2.498 ms
test=# UPDATE pgbench_accounts SET abalance = 1000 WHERE aid = 1;
UPDATE 1
Time: 18.634 ms
test=# SELECT * FROM bench2 WHERE aid = 1;
 aid | bid | abalance | bbalance 
-----+-----+----------+----------
   1 |   1 |     1000 |       10
(1 row)
However, if there is not the index on bench2, it took 4 sec, so
appropriate indexes are needed on IVM matviews.
test=# DROP INDEX bench2_aid_bid_idx ;
DROP INDEX
Time: 10.613 ms
test=# UPDATE pgbench_accounts SET abalance = 2000 WHERE aid = 1;
UPDATE 1
Time: 3931.274 ms (00:03.931)
* Restrictions on view definition
This patch is still in Work-in-Progress and there are many restrictions
on the view definition query of matviews.
The current implementation supports views including selection, projection,
and inner join with or without DISTINCT. Aggregation and GROUP BY are not
supported yet, but I plan to deal with these by the first release. 
Self-join, subqueries, OUTER JOIN, CTE, window functions are not
considered well, either. I need more investigation on these type of views
although I found some papers explaining how to handle sub-queries and
outer-joins. 
These unsupported views should be checked when a matview is created, but
this is not implemented yet.  Hoshiai-san are working on this.
* Timing of view maintenance
This patch implements a kind of Immediate Maintenance, that is, a matview
is updated immediately when a base table is modified.  On other hand, in
"Deferred Maintenance", matviews are updated after the transaction, for
example, by the user command like REFRESH. 
For implementing "deferred", it is need to implement a mechanism to maintain
logs for recording changes of base tables and an algorithm to compute the
delta to be applied to matviews. 
 
In addition, there could be another implementation of Immediate Maintenance
in which matview is updated at the end of a transaction that modified base
table, rather than in AFTER trigger. Oracle supports this type of IVM. To
implement this, we will need a mechanism to maintain change logs on base
tables as well as Deferred maintenance.
* Counting algorithm implementation
There will be also discussions on counting-algorithm implementation.
Firstly, the current patch treats "__ivm_count__" as a special column name
in a somewhat ad hoc way. This is used when maintaining and accessing matviews,
and when "SELECT * FROM ..." is issued,  __ivm_count__ column is invisible for
users.  Maybe this name has to be inhibited in user tables. Is it acceptable
to use such columns for IVM, and is there better way, if not?
Secondly, a matview with duplicate tuples is replaces with a subquery which
uses generate_series function. It does not have to be generate_series, and we
can make a new set returning function for this. Anyway, this internal behaviour
is visible in EXPLAIN results as shown in Example 3. Also, there is a
performance impact because   estimated rows number is wrong, and what is worse,
the cost of join is not small when the size of matview is large. Therefore, we
might have to add a new plan node for selecting from matviews rather than using
such a special set returning function.
Ragards,
-- 
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
| Attachment | Content-Type | Size | 
|---|---|---|
| WIP_immediate_IVM.patch | text/x-diff | 44.3 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2019-05-14 06:49:50 | Re: Caveats from reloption toast_tuple_target | 
| Previous Message | Andres Freund | 2019-05-14 06:19:53 | Re: [HACKERS] Unlogged tables cleanup |