Re: Consecutive Query Executions with Increasing Execution Time

From: Shijia Wei <shijiawei(at)utexas(dot)edu>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Consecutive Query Executions with Increasing Execution Time
Date: 2019-12-16 17:28:09
Message-ID: CABSfb-55VAcUwvy+PtsBsEk8EMfODtC6p65qVHSYhOqWWf80mA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Laurenz,

Each time the identical query executes, the total number of rows selected
stays the same. The table is actually not modified between/during runs.

The query plan stays the same between fast and slow runs. Please find two
copied here:
The first one is the output of the first query in the loop 1-to-20; The
second one is the output of the last query (20th).

1st Query:
Finalize Aggregate (cost=126840.37..126840.38 rows=1 width=12) (actual
time=178.825..178.826 rows=1 loops=1)
Buffers: shared hit=17074 read=16388
-> Gather (cost=126839.73..126840.34 rows=6 width=12) (actual
time=178.786..180.064 rows=7 loops=1)
Workers Planned: 6
Workers Launched: 6
Buffers: shared hit=17074 read=16388
-> Partial Aggregate (cost=125839.73..125839.74 rows=1 width=12)
(actual time=176.781..176.781 rows=1 loops=7)
Buffers: shared hit=17074 read=16388
-> Parallel Index Only Scan using i_l_shipdate on lineitem
(cost=0.43..120842.46 rows=999455 width=4) (actual time=0.045..114.871
rows=856704 loops=7)
Index Cond: (l_shipdate <= '1998-11-11
00:00:00'::timestamp without time zone)
Heap Fetches: 0
Buffers: shared hit=17074 read=16388
Planning Time: 0.458 ms
Execution Time: 180.111 ms
(14 rows)

20th Query:
Finalize Aggregate (cost=126840.37..126840.38 rows=1 width=12) (actual
time=223.928..223.929 rows=1 loops=1)
Buffers: shared hit=17037 read=16390
-> Gather (cost=126839.73..126840.34 rows=6 width=12) (actual
time=223.856..225.474 rows=7 loops=1)
Workers Planned: 6
Workers Launched: 6
Buffers: shared hit=17037 read=16390
-> Partial Aggregate (cost=125839.73..125839.74 rows=1 width=12)
(actual time=221.918..221.918 rows=1 loops=7)
Buffers: shared hit=17037 read=16390
-> Parallel Index Only Scan using i_l_shipdate on lineitem
(cost=0.43..120842.46 rows=999455 width=4) (actual time=0.062..143.808
rows=856704 loops=7)
Index Cond: (l_shipdate <= '1998-11-11
00:00:00'::timestamp without time zone)
Heap Fetches: 0
Buffers: shared hit=17037 read=16390
Planning Time: 0.552 ms
Execution Time: 225.529 ms
(14 rows)

One difference I noticed here is that "actual time" of the Parallel Index
Only Scan increased from 114ms to 143ms.
The same holds for other examples that involve Parallel Seq Scan.

Thanks,
Shijia

On Mon, Dec 16, 2019 at 7:25 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:

> On Sun, 2019-12-15 at 23:59 -0600, Shijia Wei wrote:
> > I am running TPC-H on recent postgresql (12.0 and 12.1).
> > On some of the queries (that may involve parallel scans) I see this
> interesting behavior:
> > When these queries are executed back-to-back (sent from psql interactive
> terminal), the total execution time of them increase monotonically.
> >
> > I simplified query-1 to demonstrate this effect:
> > ``` example.sql
> > explain (analyze, buffers) select
> > max(l_shipdate) as max_data,
> > count(*) as count_order
> > from
> > lineitem
> > where
> > l_shipdate <= date '1998-12-01' - interval '20' day;
> > ```
> >
> > When I execute (from fish) following command:
> > `for i in (seq 1 20); psql tpch < example.sql | grep Execution; end`
> > The results I get are as follows:
> > "
> > Execution Time: 184.864 ms
> > Execution Time: 192.758 ms
> > Execution Time: 197.380 ms
> > Execution Time: 200.384 ms
> > Execution Time: 202.950 ms
> > Execution Time: 205.695 ms
> > Execution Time: 208.082 ms
> > Execution Time: 209.108 ms
> > Execution Time: 212.428 ms
> > Execution Time: 214.539 ms
> > Execution Time: 215.799 ms
> > Execution Time: 219.057 ms
> > Execution Time: 222.102 ms
> > Execution Time: 223.779 ms
> > Execution Time: 227.819 ms
> > Execution Time: 229.710 ms
> > Execution Time: 239.439 ms
> > Execution Time: 237.649 ms
> > Execution Time: 249.178 ms
> > Execution Time: 261.268 ms
>
> I don't know TPC-H, but the slowdown is not necessarily surprising:
>
> If the number of rows that satisfy the condition keeps growing over time,
> counting those rows will necessarily take longer.
>
> Maybe you can provide more details, for example EXPLAIN (ANALYZE, BUFFERS)
> output for the query when it is fast and when it is slow.
>
> Yours,
> Laurenz Albe
> --
> +43-670-6056265
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26, A-2700 Wiener Neustadt
> Web: https://www.cybertec-postgresql.com
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2019-12-16 17:29:12 Re: performance degredation after upgrade from 9.6 to 12
Previous Message Tom Lane 2019-12-16 13:50:36 Re: Consecutive Query Executions with Increasing Execution Time