Re: Consecutive Query Executions with Increasing Execution Time

From: Shijia Wei <shijiawei(at)utexas(dot)edu>
To: Olivier Gautherot <ogautherot(at)gautherot(dot)net>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Consecutive Query Executions with Increasing Execution Time
Date: 2019-12-16 09:51:24
Message-ID: CABSfb-5ZcO48K5sS0oM3njQYpTo-K2MdxMZ1pM0DpiBFGPqw9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Olivier,

I do not think that the queries are executed concurrently. The bash for
loop ensures that the next command fires only after the first returns.
Also for some 'complex' queries, even a wait-period that is longer than the
total execution time does not completely avoid this effect.
For example, a wait-period of 5-second in between queries that take
2-second to run, does not help avoid the increasing runtime problem
completely.

Thanks,
Shijia

On Mon, Dec 16, 2019 at 2:04 AM Olivier Gautherot <ogautherot(at)gautherot(dot)net>
wrote:

> Hi Shijia,
>
> It sounds like concurrency on the queries: the second starts before the
> first ends, and so on. With a short wait in between you ensure sequential
> execution. Notice that you also have the overhead of concurrent psql...
>
> Sounds normal to me.
>
> Best regards
> Olivier
>
>
> On Mon, Dec 16, 2019, 07:00 Shijia Wei <shijiawei(at)utexas(dot)edu> wrote:
>
>> Hi all,
>>
>> I am running TPC-H on recent postgresql (12.0 and 12.1).
>> On some of the queries (that may involve parallel scans) I see this
>> interesting behavior:
>> When these queries are executed back-to-back (sent from psql interactive
>> terminal), the total execution time of them increase monotonically.
>>
>> I simplified query-1 to demonstrate this effect:
>> ``` example.sql
>> explain (analyze, buffers) select
>> max(l_shipdate) as max_data,
>> count(*) as count_order
>> from
>> lineitem
>> where
>> l_shipdate <= date '1998-12-01' - interval '20' day;
>> ```
>>
>> When I execute (from fish) following command:
>> `for i in (seq 1 20); psql tpch < example.sql | grep Execution; end`
>> The results I get are as follows:
>> "
>> Execution Time: 184.864 ms
>> Execution Time: 192.758 ms
>> Execution Time: 197.380 ms
>> Execution Time: 200.384 ms
>> Execution Time: 202.950 ms
>> Execution Time: 205.695 ms
>> Execution Time: 208.082 ms
>> Execution Time: 209.108 ms
>> Execution Time: 212.428 ms
>> Execution Time: 214.539 ms
>> Execution Time: 215.799 ms
>> Execution Time: 219.057 ms
>> Execution Time: 222.102 ms
>> Execution Time: 223.779 ms
>> Execution Time: 227.819 ms
>> Execution Time: 229.710 ms
>> Execution Time: 239.439 ms
>> Execution Time: 237.649 ms
>> Execution Time: 249.178 ms
>> Execution Time: 261.268 ms
>> "
>> In addition, if the repeated more times, the total execution time can end
>> up being 10X and more!!!
>>
>> When there a wait period in-between queries, (e.g. sleep 10) in the
>> above for loop, this increasing execution time behavior goes a way.
>> For more complex queries, the "wait period" needs to be longer to avoid
>> the increase in execution time.
>>
>> Some metadata about this table "lineitem":
>> tpch=# \d lineitem
>> Table "public.lineitem"
>> Column | Type | Collation | Nullable | Default
>> -----------------+-----------------------+-----------+----------+---------
>> l_orderkey | integer | | not null |
>> l_partkey | integer | | not null |
>> l_suppkey | integer | | not null |
>> l_linenumber | integer | | not null |
>> l_quantity | numeric(15,2) | | not null |
>> l_extendedprice | numeric(15,2) | | not null |
>> l_discount | numeric(15,2) | | not null |
>> l_tax | numeric(15,2) | | not null |
>> l_returnflag | character(1) | | not null |
>> l_linestatus | character(1) | | not null |
>> l_shipdate | date | | not null |
>> l_commitdate | date | | not null |
>> l_receiptdate | date | | not null |
>> l_shipinstruct | character(25) | | not null |
>> l_shipmode | character(10) | | not null |
>> l_comment | character varying(44) | | not null |
>> Indexes:
>> "i_l_commitdate" btree (l_commitdate)
>> "i_l_orderkey" btree (l_orderkey)
>> "i_l_orderkey_quantity" btree (l_orderkey, l_quantity)
>> "i_l_partkey" btree (l_partkey)
>> "i_l_receiptdate" btree (l_receiptdate)
>> "i_l_shipdate" btree (l_shipdate)
>> "i_l_suppkey" btree (l_suppkey)
>> "i_l_suppkey_partkey" btree (l_partkey, l_suppkey)
>>
>> tpch=# SELECT relname, relpages, reltuples, relallvisible, relkind,
>> relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class
>> WHERE relname='lineitem';
>> relname | relpages | reltuples | relallvisible | relkind | relnatts
>> | relhassubclass | reloptions | pg_table_size
>>
>> ----------+----------+--------------+---------------+---------+----------+----------------+------------+---------------
>> lineitem | 112503 | 6.001167e+06 | 112503 | r | 16
>> | f | | 921903104
>> (1 row)
>>
>> Postgresql 12.0 and 12.1 are all manually installed from source.
>> Both are running on Ubuntu 16.04 kernel 4.4.0-142-generic, on Intel(R)
>> Core(TM) i7-6700K.
>>
>>
>> Any help greatly appreciated!
>>
>> Shijia
>>
>

--
*Shijia Wei*
ECE, UT Austin | ACSES | 3rd Year PhD
shijiawei(at)utexas(dot)edu | https://0x161e-swei.github.io

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Olivier Gautherot 2019-12-16 10:17:50 Re: Consecutive Query Executions with Increasing Execution Time
Previous Message Olivier Gautherot 2019-12-16 08:03:54 Re: Consecutive Query Executions with Increasing Execution Time