From: | Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> |
---|---|
To: | Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Implementing Incremental View Maintenance |
Date: | 2019-06-20 07:44:10 |
Message-ID: | 20190620164410.0218e26e0bdedb7574dec2eb@sraoss.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi hackers,
Thank you for your many questions and feedbacks at PGCon 2019.
Attached is the patch rebased for the current master branch.
Regards,
Yugo Nagata
On Tue, 14 May 2019 15:46:48 +0900
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> On Mon, 1 Apr 2019 12:11:22 +0900
> Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
> > On Thu, 27 Dec 2018 21:57:26 +0900
> > Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> >
> > > Hi,
> > >
> > > I would like to implement Incremental View Maintenance (IVM) on PostgreSQL.
> >
> > I am now working on an initial patch for implementing IVM on PostgreSQL.
> > This enables materialized views to be updated incrementally after one
> > of their base tables is modified.
>
> Attached is a WIP patch of Incremental View Maintenance (IVM).
> Major part is written by me, and changes in syntax and pg_class
> are Hoshiai-san's work.
>
> Although this is sill a draft patch in work-in-progress, any
> suggestions or thoughts would be appreciated.
>
> * What it is
>
> This allows a kind of Immediate Maintenance of materialized views. if a
> materialized view is created by CRATE INCREMENTAL MATERIALIZED VIEW command,
> the contents of the mateview is updated automatically and incrementally
> after base tables are updated. Noted this syntax is just tentative, so it
> may be changed.
>
> ====== Example 1 ======
> postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m AS SELECT * FROM t0;
> SELECT 3
> postgres=# SELECT * FROM m;
> i
> ---
> 3
> 2
> 1
> (3 rows)
>
> postgres=# INSERT INTO t0 VALUES (4);
> INSERT 0 1
> postgres=# SELECt * FROM m; -- automatically updated
> i
> ---
> 3
> 2
> 1
> 4
> (4 rows)
> =============================
>
> This implementation also supports matviews including duplicate tuples or
> DISTINCT clause in its view definition query. For example, even if a matview
> is defined with DISTINCT to remove duplication of tuples in a base table, this
> can perform incremental update of the matview properly. That is, the contents
> of the matview doesn't change when exiting tuples are inserted into the base
> tables, and a tuple in the matview is deleted only when duplicity of the
> corresponding tuple in the base table becomes zero.
>
> This is due to "colunting alogorithm" in which the number of each tuple is
> stored in matviews as a special column value.
>
> ====== Example 2 ======
> postgres=# SELECT * FROM t1;
> id | t
> ----+---
> 1 | A
> 2 | B
> 3 | C
> 4 | A
> (4 rows)
>
> postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m1 AS SELECT t FROM t1;
> SELECT 3
> postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m2 AS SELECT DISTINCT t FROM t1;
> SELECT 3
> postgres=# SELECT * FROM m1; -- with duplicity
> t
> ---
> A
> A
> C
> B
> (4 rows)
>
> postgres=# SELECT * FROM m2;
> t
> ---
> A
> B
> C
> (3 rows)
>
> postgres=# INSERT INTO t1 VALUES (5, 'B');
> INSERT 0 1
> postgres=# DELETE FROM t1 WHERE id IN (1,3); -- delete (1,A),(3,C)
> DELETE 2
> postgres=# SELECT * FROM m1; -- one A left and one more B
> t
> ---
> B
> B
> A
> (3 rows)
>
> postgres=# SELECT * FROM m2; -- only C is removed
> t
> ---
> B
> A
> (2 rows)
> =============================
>
> * How it works
>
> 1. Creating matview
>
> When a matview is created, AFTER triggers are internally created
> on its base tables. When the base tables is modified (INSERT, DELETE,
> UPDATE), the matview is updated incrementally in the trigger function.
>
> When populating the matview, GROUP BY and count(*) are added to the
> view definition query before this is executed for counting duplicity
> of tuples in the matview. The result of count is stored in the matview
> as a special column named "__ivm_count__".
>
> 2. Maintenance of matview
>
> When base tables are modified, the change set of the table can be
> referred as Ephemeral Named Relations (ENRs) thanks to Transition Table
> (a feature of trigger implemented since PG10). We can calculate the diff
> set of the matview by replacing the base table in the view definition
> query with the ENR (at least if it is Selection-Projection -Join view).
> As well as view definition time, GROUP BY and count(*) is added in order
> to count the duplicity of tuples in the diff set. As a result, two diff
> sets (to be deleted from and to be inserted into the matview) are
> calculated, and the results are stored into temporary tables respectively.
>
> The matiview is updated by merging these change sets. Instead of executing
> DELETE or INSERT simply, the values of __ivm_count__ column in the matview
> is decreased or increased. When the values becomes zero, the corresponding
> tuple is deleted from the matview.
>
> 3. Access to matview
>
> When SELECT is issued for IVM matviews defined with DISTINCT, all columns
> except to __ivm_count__ of each tuple in the matview are returned. This is
> correct because duplicity of tuples are eliminated by GROUP BY.
>
> When DISTINCT is not used, SELECT for the IVM matviews returns each tuple
> __ivm_count__ times. Currently, this is implemented by rewriting the SELECT
> query to replace the matview RTE with a subquery which joins the matview
> and generate_series function as bellow.
>
> SELECT mv.* FROM mv, generate_series(1, mv.__ivm_count__);
>
> __ivm_count__ column is invisible for users when "SELECT * FROM ..." is
> issued, but users can see the value by specifying in target list explicitly.
>
> ====== Example 3 ======
> postgres=# \d+ m1
> Materialized view "public.m1"
> Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
> ---------------+--------+-----------+----------+---------+----------+--------------+-------------
> t | text | | | | extended | |
> __ivm_count__ | bigint | | | | plain | |
> View definition:
> SELECT t1.t
> FROM t1;
> Access method: heap
>
> postgres=# \d+ m2
> Materialized view "public.m2"
> Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
> ---------------+--------+-----------+----------+---------+----------+--------------+-------------
> t | text | | | | extended | |
> __ivm_count__ | bigint | | | | plain | |
> View definition:
> SELECT DISTINCT t1.t
> FROM t1;
> Access method: heap
>
> postgres=# SELECT *, __ivm_count__ FROM m1;
> t | __ivm_count__
> ---+---------------
> B | 2
> B | 2
> A | 1
> (3 rows)
>
> postgres=# SELECT *, __ivm_count__ FROM m2;
> t | __ivm_count__
> ---+---------------
> B | 2
> A | 1
> (2 rows)
>
> postgres=# EXPLAIN SELECT * FROM m1;
> QUERY PLAN
> ------------------------------------------------------------------------------
> Nested Loop (cost=0.00..61.03 rows=3000 width=2)
> -> Seq Scan on m1 mv (cost=0.00..1.03 rows=3 width=10)
> -> Function Scan on generate_series (cost=0.00..10.00 rows=1000 width=0)
> (3 rows)
> =============================
>
> * Simple Performance Evaluation
>
> I confirmed that "incremental" update of matviews is more effective
> than the standard REFRESH by using simple exapmle. I used tables
> of pgbench (SF=100) here.
>
> Create two matviews, that is, without and with IVM.
>
> test=# CREATE MATERIALIZED VIEW bench1 AS
> SELECT aid, bid, abalance, bbalance
> FROM pgbench_accounts JOIN pgbench_branches USING (bid)
> WHERE abalance > 0 OR bbalance > 0;
> SELECT 5001054
> test=# CREATE INCREMENTAL MATERIALIZED VIEW bench2 AS
> SELECT aid, bid, abalance, bbalance
> FROM pgbench_accounts JOIN pgbench_branches USING (bid)
> WHERE abalance > 0 OR bbalance > 0;
> SELECT 5001054
>
> The standard REFRESH of bench1 took more than 10 seconds.
>
> test=# \timing
> Timing is on.
> test=# REFRESH MATERIALIZED VIEW bench1 ;
> REFRESH MATERIALIZED VIEW
> Time: 11210.563 ms (00:11.211)
>
> Create an index on the IVM matview (bench2).
>
> test=# CREATE INDEX on bench2(aid,bid);
> CREATE INDEX
>
> Updating a tuple in pgbench_accounts took 18ms. After this, bench2
> was updated automatically and correctly.
>
> test=# SELECT * FROM bench2 WHERE aid = 1;
> aid | bid | abalance | bbalance
> -----+-----+----------+----------
> 1 | 1 | 10 | 10
> (1 row)
>
> Time: 2.498 ms
> test=# UPDATE pgbench_accounts SET abalance = 1000 WHERE aid = 1;
> UPDATE 1
> Time: 18.634 ms
> test=# SELECT * FROM bench2 WHERE aid = 1;
> aid | bid | abalance | bbalance
> -----+-----+----------+----------
> 1 | 1 | 1000 | 10
> (1 row)
>
> However, if there is not the index on bench2, it took 4 sec, so
> appropriate indexes are needed on IVM matviews.
>
> test=# DROP INDEX bench2_aid_bid_idx ;
> DROP INDEX
> Time: 10.613 ms
> test=# UPDATE pgbench_accounts SET abalance = 2000 WHERE aid = 1;
> UPDATE 1
> Time: 3931.274 ms (00:03.931)
>
> * Restrictions on view definition
>
> This patch is still in Work-in-Progress and there are many restrictions
> on the view definition query of matviews.
>
> The current implementation supports views including selection, projection,
> and inner join with or without DISTINCT. Aggregation and GROUP BY are not
> supported yet, but I plan to deal with these by the first release.
> Self-join, subqueries, OUTER JOIN, CTE, window functions are not
> considered well, either. I need more investigation on these type of views
> although I found some papers explaining how to handle sub-queries and
> outer-joins.
>
> These unsupported views should be checked when a matview is created, but
> this is not implemented yet. Hoshiai-san are working on this.
>
> * Timing of view maintenance
>
> This patch implements a kind of Immediate Maintenance, that is, a matview
> is updated immediately when a base table is modified. On other hand, in
> "Deferred Maintenance", matviews are updated after the transaction, for
> example, by the user command like REFRESH.
>
> For implementing "deferred", it is need to implement a mechanism to maintain
> logs for recording changes of base tables and an algorithm to compute the
> delta to be applied to matviews.
>
> In addition, there could be another implementation of Immediate Maintenance
> in which matview is updated at the end of a transaction that modified base
> table, rather than in AFTER trigger. Oracle supports this type of IVM. To
> implement this, we will need a mechanism to maintain change logs on base
> tables as well as Deferred maintenance.
>
> * Counting algorithm implementation
>
> There will be also discussions on counting-algorithm implementation.
> Firstly, the current patch treats "__ivm_count__" as a special column name
> in a somewhat ad hoc way. This is used when maintaining and accessing matviews,
> and when "SELECT * FROM ..." is issued, __ivm_count__ column is invisible for
> users. Maybe this name has to be inhibited in user tables. Is it acceptable
> to use such columns for IVM, and is there better way, if not?
>
> Secondly, a matview with duplicate tuples is replaces with a subquery which
> uses generate_series function. It does not have to be generate_series, and we
> can make a new set returning function for this. Anyway, this internal behaviour
> is visible in EXPLAIN results as shown in Example 3. Also, there is a
> performance impact because estimated rows number is wrong, and what is worse,
> the cost of join is not small when the size of matview is large. Therefore, we
> might have to add a new plan node for selecting from matviews rather than using
> such a special set returning function.
>
>
> Ragards,
> --
> Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Attachment | Content-Type | Size |
---|---|---|
WIP_immediate_IVM_v2.patch | text/x-diff | 44.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2019-06-20 08:54:01 | Re: POC: Cleaning up orphaned files using undo logs |
Previous Message | Peter Eisentraut | 2019-06-20 07:30:34 | unlogged sequences |