Re: Consecutive Query Executions with Increasing Execution Time

From: Olivier Gautherot <ogautherot(at)gautherot(dot)net>
To: Shijia Wei <shijiawei(at)utexas(dot)edu>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Consecutive Query Executions with Increasing Execution Time
Date: 2019-12-16 10:17:50
Message-ID: CAJ7S9TVc2vNPWZb6wKR2UmiUTheyGdJjexvcxinZw16D2uHj7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Shijia,

If you're using fish, I suspect you're on a Mac - I don't have experience
on this platform.

Can you check with pgAdmin (3 or 4) what the server is busy doing after a
few iterations? Check for locks, as it could be a cause. Also, do you have
concurrent INSERTs?

Olivier

On Mon, Dec 16, 2019, 10:52 Shijia Wei <shijiawei(at)utexas(dot)edu> wrote:

> Hi Olivier,
>
> I do not think that the queries are executed concurrently. The bash for
> loop ensures that the next command fires only after the first returns.
> Also for some 'complex' queries, even a wait-period that is longer than
> the total execution time does not completely avoid this effect.
> For example, a wait-period of 5-second in between queries that take
> 2-second to run, does not help avoid the increasing runtime problem
> completely.
>
> Thanks,
> Shijia
>
>
> On Mon, Dec 16, 2019 at 2:04 AM Olivier Gautherot <
> ogautherot(at)gautherot(dot)net> wrote:
>
>> Hi Shijia,
>>
>> It sounds like concurrency on the queries: the second starts before the
>> first ends, and so on. With a short wait in between you ensure sequential
>> execution. Notice that you also have the overhead of concurrent psql...
>>
>> Sounds normal to me.
>>
>> Best regards
>> Olivier
>>
>>
>> On Mon, Dec 16, 2019, 07:00 Shijia Wei <shijiawei(at)utexas(dot)edu> wrote:
>>
>>> Hi all,
>>>
>>> I am running TPC-H on recent postgresql (12.0 and 12.1).
>>> On some of the queries (that may involve parallel scans) I see this
>>> interesting behavior:
>>> When these queries are executed back-to-back (sent from psql
>>> interactive terminal), the total execution time of them
>>> increase monotonically.
>>>
>>> I simplified query-1 to demonstrate this effect:
>>> ``` example.sql
>>> explain (analyze, buffers) select
>>> max(l_shipdate) as max_data,
>>> count(*) as count_order
>>> from
>>> lineitem
>>> where
>>> l_shipdate <= date '1998-12-01' - interval '20' day;
>>> ```
>>>
>>> When I execute (from fish) following command:
>>> `for i in (seq 1 20); psql tpch < example.sql | grep Execution; end`
>>> The results I get are as follows:
>>> "
>>> Execution Time: 184.864 ms
>>> Execution Time: 192.758 ms
>>> Execution Time: 197.380 ms
>>> Execution Time: 200.384 ms
>>> Execution Time: 202.950 ms
>>> Execution Time: 205.695 ms
>>> Execution Time: 208.082 ms
>>> Execution Time: 209.108 ms
>>> Execution Time: 212.428 ms
>>> Execution Time: 214.539 ms
>>> Execution Time: 215.799 ms
>>> Execution Time: 219.057 ms
>>> Execution Time: 222.102 ms
>>> Execution Time: 223.779 ms
>>> Execution Time: 227.819 ms
>>> Execution Time: 229.710 ms
>>> Execution Time: 239.439 ms
>>> Execution Time: 237.649 ms
>>> Execution Time: 249.178 ms
>>> Execution Time: 261.268 ms
>>> "
>>> In addition, if the repeated more times, the total execution time can
>>> end up being 10X and more!!!
>>>
>>> When there a wait period in-between queries, (e.g. sleep 10) in the
>>> above for loop, this increasing execution time behavior goes a way.
>>> For more complex queries, the "wait period" needs to be longer to avoid
>>> the increase in execution time.
>>>
>>> Some metadata about this table "lineitem":
>>> tpch=# \d lineitem
>>> Table "public.lineitem"
>>> Column | Type | Collation | Nullable |
>>> Default
>>>
>>> -----------------+-----------------------+-----------+----------+---------
>>> l_orderkey | integer | | not null |
>>> l_partkey | integer | | not null |
>>> l_suppkey | integer | | not null |
>>> l_linenumber | integer | | not null |
>>> l_quantity | numeric(15,2) | | not null |
>>> l_extendedprice | numeric(15,2) | | not null |
>>> l_discount | numeric(15,2) | | not null |
>>> l_tax | numeric(15,2) | | not null |
>>> l_returnflag | character(1) | | not null |
>>> l_linestatus | character(1) | | not null |
>>> l_shipdate | date | | not null |
>>> l_commitdate | date | | not null |
>>> l_receiptdate | date | | not null |
>>> l_shipinstruct | character(25) | | not null |
>>> l_shipmode | character(10) | | not null |
>>> l_comment | character varying(44) | | not null |
>>> Indexes:
>>> "i_l_commitdate" btree (l_commitdate)
>>> "i_l_orderkey" btree (l_orderkey)
>>> "i_l_orderkey_quantity" btree (l_orderkey, l_quantity)
>>> "i_l_partkey" btree (l_partkey)
>>> "i_l_receiptdate" btree (l_receiptdate)
>>> "i_l_shipdate" btree (l_shipdate)
>>> "i_l_suppkey" btree (l_suppkey)
>>> "i_l_suppkey_partkey" btree (l_partkey, l_suppkey)
>>>
>>> tpch=# SELECT relname, relpages, reltuples, relallvisible, relkind,
>>> relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class
>>> WHERE relname='lineitem';
>>> relname | relpages | reltuples | relallvisible | relkind | relnatts
>>> | relhassubclass | reloptions | pg_table_size
>>>
>>> ----------+----------+--------------+---------------+---------+----------+----------------+------------+---------------
>>> lineitem | 112503 | 6.001167e+06 | 112503 | r |
>>> 16 | f | | 921903104
>>> (1 row)
>>>
>>> Postgresql 12.0 and 12.1 are all manually installed from source.
>>> Both are running on Ubuntu 16.04 kernel 4.4.0-142-generic, on Intel(R)
>>> Core(TM) i7-6700K.
>>>
>>>
>>> Any help greatly appreciated!
>>>
>>> Shijia
>>>
>>
>
> --
> *Shijia Wei*
> ECE, UT Austin | ACSES | 3rd Year PhD
> shijiawei(at)utexas(dot)edu | https://0x161e-swei.github.io
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mariel Cherkassky 2019-12-16 11:48:20 Re: performance degredation after upgrade from 9.6 to 12
Previous Message Shijia Wei 2019-12-16 09:51:24 Re: Consecutive Query Executions with Increasing Execution Time