From: | Vladlen Popolitov <v(dot)popolitov(at)postgrespro(dot)ru> |
To: | pgsql-hackers(at)postgresql(dot)org, hukutoc(at)gmail(dot)com, andres(at)anarazel(dot)de |
Subject: | Re: PoC. The saving of the compiled jit-code in the plan cache |
Date: | 2025-02-13 14:49:38 |
Lists: | pgsql-hackers |
Dear colleagues,
Attached is an updated patch, rebased on current master. The description
below is unchanged, and no new features have been added yet.
To apply it to the same master revision:
git checkout 83ea6c54025bea67bcd4949a6d58d3fc11c3e21b
patch -p1 <v5-0001-jit-saved-for-cached-plans.patch
> Vladlen Popolitov wrote on 2025-02-13 16:01:
> Dear colleagues,
> I implemented a patch that stores already compiled jit code in the plan
> cache. If a query is executed many times, its execution time can
> decrease by 20-30% for long queries in which expressions make up a large
> share of the code.
> This patch applies to the current master and passes the tests, but it is
> only a proof of concept, and I need opinions and advice on the next
> steps. It is not ready for a commitfest, does not cover all scenarios
> (it crashes in some situations that need a special solution), and the
> source code is not clean.
> The old code is left in the source, commented out with //, to make it
> easier to see what was changed, especially in llvmjit_expr.c and
> llvmjit_deform.c.
> Implementation details.
> 1. Changes in jit-code generation.
> a) Loading an absolute address (as a constant) is replaced by loading
> that address from a struct member:
> old version:
> v_resvaluep = l_ptr_const(op->resvalue, l_ptr(TypeSizeT));
> new version:
> v_resvaluep = l_load_struct_gep( b, g->StructExprEvalStep, v_op,
> b) Loading an absolute address or a value taken from a union is replaced
> by loading the union value through the offset of the union member:
> old version (fcinfo is the union member value):
> v_fcinfo = l_ptr_const(fcinfo, l_ptr(g->StructFunctionCallInfoData));
> new version:
> v_fcinfo = l_load_member_value_by_offset(b, lc, v_op,
> l_ptr(g->StructFunctionCallInfoData), offsetof(ExprEvalStep,
> d.func.fcinfo_data));
> c) Every cached plan has its own LLVM context with one module in it. As
> a result, some functions need a context parameter so that they address
> the right context instead of the global llvm_context variable.
> d) The llvm_types are moved from global variables into the struct
> LLVMJitTypes. Non-cached queries use the old setup, but all type
> definitions are stored in the global struct LLVMJitTypes
> llvm_jit_types_struct instead of many global variables. Cached queries
> allocate a struct LLVMJitTypes during query execution and pfree it when
> the cached plan is deallocated. The llvm_types module is read for every
> cached statement (not once during global context creation).
> e) llvm_function_reference still uses l_ptr_const with the absolute
> address of the function. This needs to be checked, as the function
> address can change.
> f) A new GUC variable, jit_cached, is added; it is off by default.
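Items a) and b) can be illustrated outside LLVM. The following is only a minimal sketch in plain C, not the builder code from the patch: ToyStep and all function names are hypothetical stand-ins, and the real ExprEvalStep layout is different. It shows why code that embeds an absolute address is tied to one step, while code that reloads the address from the step (or reads a union member through its offset) can be reused with any step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ExprEvalStep; not the real PostgreSQL layout. */
typedef struct ToyStep
{
    intptr_t   *resvalue;       /* where the step stores its result */
    union
    {
        struct { void *fcinfo_data; } func;
        struct { int   attnum; }      var;
    } d;
} ToyStep;

/* Old style: the target address is fixed once, like a constant in the code. */
static intptr_t *baked_resvalue;

void
store_baked(intptr_t v)
{
    *baked_resvalue = v;
}

/* New style, item a): the address is reloaded from the step on every call. */
void
store_via_step(ToyStep *op, intptr_t v)
{
    *op->resvalue = v;
}

/* New style, item b): a union member is read through its byte offset. */
void *
load_member_by_offset(const ToyStep *op, size_t offset)
{
    void *value;

    memcpy(&value, (const char *) op + offset, sizeof(value));
    return value;
}

/* Returns 1 if the same routine wrote through two different steps. */
int
reuse_demo(void)
{
    intptr_t a = 0, b = 0;
    ToyStep  s1 = { .resvalue = &a };
    ToyStep  s2 = { .resvalue = &b };

    assert(s1.resvalue != s2.resvalue);
    store_via_step(&s1, 1);
    store_via_step(&s2, 2);
    return a == 1 && b == 2;
}
```

The price of the second form is one extra load per access, which is the slowdown mentioned in 3 c) below.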
> 2. Changes in cached plan.
> To store the cached context I use struct PlannedStmt (reached from
> ExprState as state->parent->state->es_plannedstmt).
> A new member, struct CachedJitContext *jit_context, is added to store
> information about the jit context. In new plans this field is NULL. When
> a plan is stored in the cache, the field is set to
> CACHED_JITCONTEXT_EMPTY and later replaced by the actual jit context
> pointer in llvmjit_expr.c.
> When the cached plan is deallocated, this member is used to free the jit
> resources and memory.
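The life cycle of jit_context described above can be sketched as a small state machine. This is an illustration under assumptions, not code from the patch: ToyPlannedStmt, the toy_* functions, the payload of CachedJitContext and the way CACHED_JITCONTEXT_EMPTY is defined here are all hypothetical; only the three states (NULL, CACHED_JITCONTEXT_EMPTY, real pointer) come from the description.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct CachedJitContext
{
    int n_compiled;             /* placeholder payload for the sketch */
} CachedJitContext;

/* Assumed sentinel: a distinct address that is never a real context. */
static CachedJitContext empty_marker;
#define CACHED_JITCONTEXT_EMPTY (&empty_marker)

/* Hypothetical stand-in for PlannedStmt with only the new member. */
typedef struct ToyPlannedStmt
{
    CachedJitContext *jit_context;
} ToyPlannedStmt;

typedef enum { JIT_NOT_CACHED, JIT_NEEDS_COMPILE, JIT_REUSE } JitAction;

/* The plan goes into the plan cache: mark it as cacheable but not compiled. */
void
toy_plan_saved_to_cache(ToyPlannedStmt *stmt)
{
    if (stmt->jit_context == NULL)
        stmt->jit_context = CACHED_JITCONTEXT_EMPTY;
}

/* What the expression compiler would decide from the field's state. */
JitAction
toy_jit_action(const ToyPlannedStmt *stmt)
{
    if (stmt->jit_context == NULL)
        return JIT_NOT_CACHED;
    if (stmt->jit_context == CACHED_JITCONTEXT_EMPTY)
        return JIT_NEEDS_COMPILE;
    return JIT_REUSE;
}

/* First execution of a cached plan: replace the marker by a real context. */
int
toy_attach_context(ToyPlannedStmt *stmt)
{
    CachedJitContext *ctx;

    assert(stmt->jit_context == CACHED_JITCONTEXT_EMPTY);
    ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL)
        return 0;
    stmt->jit_context = ctx;
    return 1;
}

/* The cached plan is deallocated: free a real context, never the marker. */
void
toy_plan_dropped(ToyPlannedStmt *stmt)
{
    if (stmt->jit_context != NULL &&
        stmt->jit_context != CACHED_JITCONTEXT_EMPTY)
        free(stmt->jit_context);
    stmt->jit_context = NULL;
}
```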
> 3. Current problems.
> When a module is compiled, each ExprState has to be connected to the
> address of its function.
> In the old implementation funcname is stored in ExprState.
> In the cached implementation the function is called a second time and
> more, and the calling code has no information about the function name
> for the ExprState.
> a) The current cached implementation uses the call number to connect,
> e.g., the 1st expression with the 1st function. This does not work for
> queries that generate an expression after module compilation (e.g.
> HashJoin) and try to compile it.
> b) Also, a query with aggregate functions generates an expression every
> time it is executed. This works with the current approach, but these
> expressions have to be taken into account, as they have a different
> expression address every time.
> One possible solution for a) and b) is to generate jit-code only for
> Expr pointers that are in the cached plan. Every newly created
> expression should either be ignored by the compiler and executed by the
> standard interpreter, or be compiled by the standard (non-cached) jit
> (which runs the compilation on every query run and eliminates all gain
> from the jit cache).
> c) The new jit-code generation (using the struct member instead of the
> direct absolute address) slightly decreases jit-code performance. It is
> possible to compile the old version for non-cached queries and the new
> code for cached queries. In that case two big 3000-line
> llvm_compile_expr() functions need to be maintained in a similar way
> whenever new expressions or features are added.
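The difference between matching by call number (3 a) and matching by Expr pointer (the solution proposed for a) and b) above) can be sketched as follows. This is a toy model under assumptions, not code from the patch: all types and function names are hypothetical, and a NULL result stands for "fall back to the interpreter or to the non-cached jit".

```c
#include <assert.h>
#include <stddef.h>

typedef int (*ToyEvalFn)(int);

typedef struct ToyCacheEntry
{
    const void *expr;           /* identity of an Expr node in the cached plan */
    ToyEvalFn   fn;             /* function compiled for that node */
} ToyCacheEntry;

typedef struct ToyJitCache
{
    const ToyCacheEntry *entries;
    size_t      n;
    size_t      next_call;      /* used only by the call-number scheme */
} ToyJitCache;

/* Call-number scheme: the n-th request gets the n-th compiled function. */
ToyEvalFn
lookup_by_call_number(ToyJitCache *cache)
{
    assert(cache != NULL);
    if (cache->next_call >= cache->n)
        return NULL;
    return cache->entries[cache->next_call++].fn;
}

/* Expr-pointer scheme: an unknown expression gets NULL, never a wrong match. */
ToyEvalFn
lookup_by_expr(const ToyJitCache *cache, const void *expr)
{
    assert(cache != NULL);
    for (size_t i = 0; i < cache->n; i++)
    {
        if (cache->entries[i].expr == expr)
            return cache->entries[i].fn;
    }
    return NULL;
}

/* Two stand-ins for compiled expression functions. */
int toy_add_one(int x) { return x + 1; }
int toy_double(int x) { return x * 2; }
```

If an expression that was not part of the cached module asks first, the call-number lookup hands it the function compiled for another expression, while the pointer lookup reports a miss and leaves the choice of fallback to the caller.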
> Attached files have
> 1) the patch (branched from 83ea6c54025bea67bcd4949a6d58d3fc11c3e21b
> master),
> 2 and 3) the benchmark files: jitinit.sql to create the jitbench
> database, and a bash script to run the benchmark (change the user and
> password to your own if needed),
> 4) a chart for this benchmark and for the query in the benchmark
> (relative to jit=off as 1 unit). It is easy to find a query where jit
> is higher or lower than jit=off. Here I demonstrate the difference
> between the standard and the new jit-code (the performance decrease when
> compiling without optimization), the high gain of the cached version
> with optimization, and the high loss of the non-cached version with
> optimization, caused by running the optimization for every query.
Best regards,
Vladlen Popolitov.
Attachment | Content-Type | Size |
v5-0001-jit-saved-for-cached-plans.patch | text/x-diff | 142.1 KB |