Re: Implementing Incremental View Maintenance

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Takuma Hoshiai <hoshiai(at)sraoss(dot)co(dot)jp>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, michael(at)paquier(dot)xyz, amitlangote09(at)gmail(dot)com, alvherre(at)2ndquadrant(dot)com, thomas(dot)munro(at)gmail(dot)com, kgrittn(at)gmail(dot)com
Subject: Re: Implementing Incremental View Maintenance
Date: 2020-04-13 05:18:35
Message-ID: 20200413141835.04fcdf6424985369554b6a4d@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 10 Apr 2020 23:26:58 +0900
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:

> Hi,
>
> Attached is the latest patch (v15) to add support for Incremental Materialized
> View Maintenance (IVM). It is possible to apply to current latest master branch.

I found a mistake of splitting patch, so I attached the fixed patch (v15a).

> Differences from the previous patch (v14) include:
>
> * Fix to not use generate_series when views are queried
>
> In the previous implementation, multiplicity of each tuple was stored
> in ivm_count column in views. When SELECT was issued for views with
> duplicate, the view was replaced with a subquery in which each tuple
> was joined with generate_series function in order to output tuples
> of the number of ivm_count.
>
> This was problematic for following reasons:
>
> - The overhead was huge. When almost of tuples in a view were selected,
> it took much longer time than the original query. This lost the meaning
> of materialized views.
>
> - Optimizer could not estimate row numbers correctly because this had to
> know ivm_count values stored in tuples.
>
> - System columns of materialized views like cmin, xmin, xmax could not
> be used because a view was replaced with a subquery.
>
> To resolve this, the new implementation doen't store multiplicities
> for views with tuple duplicates, and doesn't use generate_series
> when SELECT query is issued for such views.
>
> Note that we still have to use ivm_count for supporting DISTINCT and
> aggregates.

I also explain the way of updating views with tuple duplicates.

Although a view itself doesn't have ivm_count column, multiplicities
for old delta and new delta are calculated and the count value is
contained in a column named __ivm_count__ in each delta table.

The old delta table is applied using ctid and row_number function.
row_number is used to numbering tuples in the view, and tuples whose
number is equal or is less than __ivm_count__ are deleted from the
view using a query like:

DELETE FROM matviewname WHERE ctid IN (
SELECT tid FROM (
SELECT row_number() over (partition by c1, c2, ...) AS __ivm_row_number__,
mv.ctid AS tid,
diff.__ivm_count__
FROM matviewname AS mv, old_delta AS diff "
WHERE mv.c1 = diff.c1 AND mv.c2 = diff.c2 AND ... ) v
WHERE v.__ivm_row_number__ <= v.__ivm_count__

The new delta is applied using generate_seriese to insert mutiple same
tuples, using a query like:

INSERT INTO matviewname (c1, c2, ...)
SELECT c1,c2,... FROM (
SELECT diff.*, generate_series(

>
> * Add query checks for IVM restrictions
>
> Query checks for following restrictions are added:
>
> - DISTINCT ON
> - TABLESAMPLE parameter
> - inheritance parent table
> - window function
> - some aggregate options(such as FILTER, DISTINCT, ORDER and GROUPING SETS)
> - targetlist containing IVM column
> - simple subquery is only supported
> - FOR UPDATE/SHARE
> - empty target list
> - UNION/INTERSECT/EXCEPT
> - GROUPING SETS clauses
>
> * Improve error messages
>
> Add error code ERRCODE_FEATURE_NOT_SUPPORTED to each IVM error message.
> Also, the message format was unified.
>
> * Support subqueries containig joins in FROM clause
>
> Previously, when multi tables are updated simultaneously, incremental
> view maintenance with subqueries including JOIN didn't work correctly
> due to a bug.
>
> Best Regards,
> Takuma Hoshiai
>
> --
> Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

Attachment Content-Type Size
IVM_patches_v15a.tar.gz application/gzip 75.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-04-13 06:10:51 Re: WAL usage calculation patch
Previous Message Amit Kapila 2020-04-13 05:14:42 Re: doc review for parallel vacuum