Re: Avoid excessive inlining?

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Laurenz Albe" <laurenz(dot)albe(at)cybertec(dot)at>, "Philip Semanchuk" <philip(at)americanefficient(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Avoid excessive inlining?
Date: 2021-01-02 19:06:54
Message-ID: a4319646-1c78-4631-a6d2-330144498d11@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I found yet another trick, which actually seems to be slightly faster than the plpgsql version.

The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it shouldn’t flatten the subqueries.

CREATE TABLE eastern (year integer);

INSERT INTO eastern (year) SELECT generate_series(1,100000);

CREATE VIEW v0 WITH (security_barrier) AS (
SELECT
year,
year % 19 AS g,
year / 100 AS c
FROM eastern
);
CREATE VIEW v1 WITH (security_barrier) AS (
SELECT
year,
g,
c,
(c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS h
FROM v0
);
CREATE VIEW v2 WITH (security_barrier) AS (
SELECT
year,
c,
h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS i
FROM v1
);
CREATE VIEW v3 WITH (security_barrier) AS (
SELECT
year,
i,
(year + year/4 + i + 2 - c + c/4) % 7 AS j
FROM v2
);
CREATE VIEW v4 WITH (security_barrier) AS (
SELECT
year,
i - j AS p
FROM v3
);
CREATE VIEW v5 WITH (security_barrier) AS (
SELECT
year,
3 + (p + 26)/30 AS easter_month,
1 + (p + 27 + (p + 6)/40) % 31 AS easter_day
FROM v4
);

EXPLAIN ANALYZE VERBOSE
SELECT make_date(year, easter_month, easter_day) FROM v5;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on v5 (cost=0.00..17943.00 rows=100000 width=4) (actual time=0.025..213.996 rows=100000 loops=1)
Output: make_date(v5.year, v5.easter_month, v5.easter_day)
-> Subquery Scan on v4 (cost=0.00..16693.00 rows=100000 width=12) (actual time=0.024..191.448 rows=100000 loops=1)
Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p + 6) / 40)) % 31))
-> Subquery Scan on v3 (cost=0.00..13443.00 rows=100000 width=8) (actual time=0.022..155.690 rows=100000 loops=1)
Output: v3.year, (v3.i - v3.j)
-> Subquery Scan on v2 (cost=0.00..12193.00 rows=100000 width=12) (actual time=0.021..135.485 rows=100000 loops=1)
Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)
-> Subquery Scan on v1 (cost=0.00..9193.00 rows=100000 width=12) (actual time=0.019..97.935 rows=100000 loops=1)
Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - (((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))
-> Subquery Scan on v0 (cost=0.00..5443.00 rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)
Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c / 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)
-> Seq Scan on public.eastern (cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 rows=100000 loops=1)
Output: eastern.year, (eastern.year % 19), (eastern.year / 100)
Planning Time: 0.274 ms
Execution Time: 220.698 ms
(16 rows)

EXPLAIN ANALYZE VERBOSE
SELECT easter_plpgsql(year) FROM eastern;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.eastern (cost=0.00..26443.00 rows=100000 width=4) (actual time=0.077..301.519 rows=100000 loops=1)
Output: easter_plpgsql(year)
Planning Time: 0.049 ms
Execution Time: 309.119 ms
(4 rows)

On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:
> "Joel Jacobson" <joel(at)compiler(dot)org> writes:
> > I think I was a bit unclear about my problem, and might have used the wrong terminology.
> > In my LATERAL query, there are calculations in a certain order.
> > For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.
> > However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded expression, which doesn't look very efficient.
>
> Yeah, this isn't really about function inlining, it's about subquery
> flattening (which is similar in some ways, but not the same thing).
>
> Unfortunately, subquery flattening happens early enough in the planner
> that there's no chance of making any useful cost comparisons to decide
> whether to do it or not. So we just do it unconditionally. I'm
> not really sure that failing to do it would provide a better outcome
> in this situation anyway --- sure, you'd save a few scalar calculations,
> but the overhead of running additional plan nodes could outweigh that.
>
> The long and the short of it is that SQL isn't terribly well suited to
> execute a fundamentally stepwise, imperative algorithm like this one.
> Rather than hacking up cute tricks with LATERAL, you should just use
> a language that *is* well suited. That's why we provide PLs.
>
> FWIW, another trick for inserting optimization fences is WITH.
> So you could do something like
>
> WITH Q1(g,c) AS MATERIALIZED
> (SELECT year % 19, year / 100),
> Q2(h) AS MATERIALIZED
> (SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),
> ...
> SELECT make_date(year, easter_month, easter_day) FROM Q6;
>
> But I'd bet lunch that that won't be faster for this example,
> because there's a lot of overhead in CTEs.
>
> regards, tom lane
>

Kind regards,

Joel

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2021-01-02 19:37:13 Re: Avoid excessive inlining?
Previous Message Adrian Klaver 2021-01-02 15:36:21 Re: Trigger with conditional predicates