Re: JIT performance question

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tobias Gierke <tobias(dot)gierke(at)code-sourcery(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: JIT performance question
Date: 2019-03-06 17:42:58
Message-ID: 20190306174258.f23avt6k3vtgqsqr@alap3.anarazel.de
Lists: pgsql-performance

Hi,

On 2019-03-06 18:16:08 +0100, Tobias Gierke wrote:
> I was playing around with PG11.2 (i6700k with 16GB RAM, on Ubuntu 18.04,
> compiled from sources) and LLVM, trying a CPU-bound query that in my simple
> mind should benefit from JIT'ting but (almost) doesn't.
>
> 1.) Test table with 195 columns of type 'numeric':
>
> CREATE TABLE test (data0 numeric,data1 numeric,data2 numeric,data3
> numeric,...,data192 numeric,data193 numeric,data194 numeric);
>
> 2.) bulk-loaded (via COPY) 2 mio. rows of randomly generated data into this
> table (and ran vacuum & analyze afterwards)
>
> 3.) Disable parallel workers to just measure JIT performance via 'set
> max_parallel_workers = 0'

FWIW, it's better to do that via max_parallel_workers_per_gather in most
cases. With max_parallel_workers = 0 the planner can still create a
parallel plan that then runs without any workers, and that has its own
consequences.
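
A minimal sketch of that setting, assuming a session-level change is
enough for the benchmark:

```sql
-- Session-level: keep the planner from producing parallel plans at all,
-- instead of planning one and then running it without workers.
SET max_parallel_workers_per_gather = 0;

-- Confirm the value in effect for this session.
SHOW max_parallel_workers_per_gather;
```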

> 4.) Execute query without JIT a couple of times to make sure table is in
> memory (I had iostat running in the background to verify that actually no
> disk access was taking place):

There are definitely accesses outside of PG happening here :(. They're
probably served from the OS cache, but without track_io_timing that's
hard to confirm. Presumably that's caused by the sequential scan
ringbuffers. I found that forcing the pages to be read in using
pg_prewarm gives more measurable results.
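
Roughly, that could look like the sketch below. The table name 'test' is
taken from the quoted mail; the aggregate query is only a placeholder,
not the query that was actually benchmarked, and prewarming can only
keep the whole table resident if it fits into shared_buffers.

```sql
-- pg_prewarm ships in contrib; install it once per database.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Read the table's blocks into shared_buffers (default mode 'buffer').
-- Returns the number of blocks prewarmed.
SELECT pg_prewarm('test');

-- Requires superuser. Adds I/O timing information to
-- EXPLAIN (ANALYZE, BUFFERS) output when reads actually happen.
SET track_io_timing = on;

-- Placeholder query (an assumption, not the one from the original mail).
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(data0) FROM test;
```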

> So (ignoring the time for JIT'ting itself) this yields only ~2-3%
> performance increase... is this because my query is just too simple to
> actually benefit a lot, meaning the code path for the 'un-JIT' case is
> already fairly optimal ? Or does JIT'ting actually only have a large impact
> on the filter/WHERE part of the query but not so much on aggregation / tuple
> deforming ?

It's hard to know precisely without running a profile of the
workload. My suspicion is that the bottleneck in this query is the use
of numeric, which has fairly slow operations, including aggregation. And
they're too complicated to be inlined.

Generally there's definitely an advantage in JITing aggregation.
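
One way to compare the two cases is sketched below, assuming a build
with LLVM support. The query is again a placeholder rather than the one
from the original mail, and setting the cost thresholds to 0 is just a
way to force JIT for the test, not a recommended production setting.

```sql
-- Baseline: expression evaluation and tuple deforming are interpreted.
SET jit = off;
EXPLAIN (ANALYZE) SELECT sum(data0) FROM test;

-- Force JIT compilation, inlining and optimization regardless of plan cost.
SET jit = on;
SET jit_above_cost = 0;
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;

-- With JIT active, the plan output ends with a "JIT:" section listing the
-- number of compiled functions, the options used and the time spent.
EXPLAIN (ANALYZE) SELECT sum(data0) FROM test;
```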

There are a lot of further improvements on the table with better JIT code
generation, I just haven't gotten around to implementing those :(

Greetings,

Andres Freund
