Re: JIT compilation per plan node

From: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: JIT compilation per plan node
Date: 2024-12-19 14:49:37
Message-ID: CAFY6G8e3tsH_samSdwk-CC6ZXeAvRyECtvAF=D1UW_D5eTfEsw@mail.gmail.com
Lists: pgsql-hackers

Hi,

It has been quite a while since the last email, but I was taking a look at this
thread and I think we could take another, similar approach to this
proposal.

On Tue, May 14, 2024 at 01:10, David Rowley
<dgrowleyml(at)gmail(dot)com> wrote:
> Currently, during execution, ExecCreateExprSetupSteps() traverses the
> Node tree of the Expr to figure out the max varattno of for each slot.
> That's done so all of the tuple deforming happens at once rather than
> incrementally. Figuring out the max varattno is a price we have to pay
> for every execution of the query. I think we'd be better off doing
> that in the planner.
>
> To do this, I thought that setrefs.c could do this processing in
> fix_join_expr / fix_upper_expr and wrap up the expression in a new
> Node type that stores the max varattno for each special var type.
>
> This idea is related to this discussion because another thing that
> could be stored in the very same struct is the "num_exec" value. I
> feel the number of executions of an ExprState is a better gauge of how
> useful JIT will be than the cost of the plan node. Now, looking at
> set_join_references(), the execution estimates are not exactly
> perfect. For example;
>
The fix_join_expr and fix_upper_expr functions have since evolved, and both now
receive a "num_exec" parameter, which can be derived from plan->plan_rows via
the NUM_EXEC_TLIST macro. After a plan is created in create_plan_recurse, both
plan_rows and total_cost are filled in, so it seems to me that all the
information needed to decide whether a plan node should be compiled is already
present in the Plan struct?

If that's the case, I was thinking: what if we add a new field, jit, to the
Plan struct and fill it in after the plan is created in create_plan_recurse?
It is a similar idea to the first patch:

static void
plan_consider_jit(Plan *plan)
{
    plan->jit = false;

    if (jit_enabled)
    {
        Cost        total_cost;

        total_cost = plan->total_cost * plan->plan_rows;

        if (total_cost > jit_above_cost)
            plan->jit = true;
    }
}

And then, when jit_compile_expr is called, we can check whether JIT is enabled
for the ExprState->parent->plan node.
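
To make the idea above concrete, here is a minimal standalone sketch of the
decision and of the check at expression-compile time. It does not use the real
PostgreSQL headers: the Plan, PlanState and ExprState structs are simplified
stand-ins, the two GUC variables are plain globals (100000 is the default
jit_above_cost), and expr_wants_jit is a hypothetical helper name for the check
that jit_compile_expr could perform. Treating an expression with no parent plan
as "no JIT" is an assumption of this sketch, not something the patch is known
to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the PostgreSQL types; not the real definitions. */
typedef double Cost;

typedef struct Plan
{
    Cost        total_cost;     /* estimated total cost of this node */
    double      plan_rows;      /* estimated number of rows emitted */
    bool        jit;            /* proposed field: JIT-compile this node? */
} Plan;

typedef struct PlanState
{
    Plan       *plan;           /* plan node this state belongs to */
} PlanState;

typedef struct ExprState
{
    PlanState  *parent;         /* NULL for standalone expressions */
} ExprState;

/* Stand-ins for the GUCs; 100000 is the default value of jit_above_cost. */
static bool   jit_enabled = true;
static double jit_above_cost = 100000.0;

/* Same logic as plan_consider_jit above: decide per plan node. */
static void
plan_consider_jit(Plan *plan)
{
    assert(plan != NULL);

    plan->jit = false;

    if (jit_enabled)
    {
        Cost        total_cost = plan->total_cost * plan->plan_rows;

        if (total_cost > jit_above_cost)
            plan->jit = true;
    }
}

/*
 * Hypothetical helper: the check jit_compile_expr could make before
 * compiling an expression.
 */
static bool
expr_wants_jit(const ExprState *state)
{
    /* Assumption: without a parent plan there is no per-node decision. */
    if (state == NULL || state->parent == NULL || state->parent->plan == NULL)
        return false;

    return state->parent->plan->jit;
}
```

With these stand-ins, a node with total_cost = 1000 and plan_rows = 1000 is
flagged for JIT (1000000 > 100000), while a node with total_cost = 10 and
plan_rows = 100 is not (1000 < 100000).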

Does this make any kind of sense? I'm not sure, but I've run some benchmarks.
I used the same test scenario as in [1], which is:

CREATE TABLE listp(a int, b int) PARTITION BY LIST(a);
SELECT 'CREATE TABLE listp'|| x || ' PARTITION OF listp FOR VALUES IN ('||x||');' FROM generate_Series(1,1000) x; \gexec
INSERT INTO listp SELECT 1,x FROM generate_series(1,10000000) x;

EXPLAIN (VERBOSE, ANALYZE) SELECT COUNT(*) FROM listp WHERE b < 0;

Results:

master jit=off:
Planning Time: 59.457 ms
Execution Time: 457.000 ms

master jit=on:
Planning Time: 59.529 ms
JIT:
Functions: 9008
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 99.267 ms (Deform 37.434 ms), Inlining 0.000 ms,
Optimization 309.079 ms, Emission 517.495 ms, Total 925.841 ms
Execution Time: 674.756 ms

patch jit=off:
Planning Time: 60.906 ms
Execution Time: 453.978 ms

patch jit=on:
Planning Time: 67.625 ms
JIT:
Functions: 17
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 0.502 ms (Deform 0.073 ms), Inlining 0.000 ms,
Optimization 0.998 ms, Emission 3.915 ms, Total 5.415 ms
Execution Time: 328.239 ms

Note that I used the jit_above_cost default value for both tests.

I've executed the same EXPLAIN query 100 times on master and with the patch,
and all results are similar to the above.

jit=off
Master mean: 438.898
Patch mean: 436.036

jit=on
Master mean: 665.730
Patch mean: 347.758
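
As a quick way to read the means above, the relative change can be computed
directly. This is just arithmetic on the four numbers reported here;
pct_change is a hypothetical helper name used only for this illustration.

```c
/*
 * Percentage change from a baseline mean to a new mean.
 * Negative values mean the new mean is lower (faster).
 */
static double
pct_change(double baseline, double value)
{
    return (value - baseline) / baseline * 100.0;
}

/* Means reported above, copied verbatim. */
static const double master_jit_off = 438.898;
static const double patch_jit_off  = 436.036;
static const double master_jit_on  = 665.730;
static const double patch_jit_on   = 347.758;
```

pct_change(master_jit_on, patch_jit_on) comes out at roughly -48%, and
pct_change(patch_jit_off, patch_jit_on) at roughly -20%, whereas on master
pct_change(master_jit_off, master_jit_on) is roughly +52%. With jit=off the
difference between master and the patch is under 1%.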

I'm a bit unsure whether it's a good idea to add the jit field to the Plan
node, and I'm also not sure whether this approach makes sense. Attached is a
simple patch that plays with the idea.

What do you think?

[1] https://www.postgresql.org/message-id/CAGPVpCSBn_-t3jvpmmhHsqNs2NOGo1tSBbyZNg1CjGgAcQJk+Q@mail.gmail.com

--
Matheus Alcantara

Attachment Content-Type Size
v1-0001-JIT-compilation-per-plan-node.patch application/x-patch 4.2 KB
