Re: Planner performance extremely affected by an hanging transaction (20-30 times)?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Bartłomiej Romański <br(at)sentia(dot)pl>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Planner performance extremely affected by an hanging transaction (20-30 times)?
Date: 2013-09-20 22:01:33
Message-ID: CAMkU=1wjnoFOrzVqLpUg2Ja=+kd_sv2HP4+cF-Y7r4a3uHT-tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Sep 19, 2013 at 5:49 PM, Bartłomiej Romański <br(at)sentia(dot)pl> wrote:

>
> Finally, we created a python script to make simple queries in a loop:
>
> while True:
> id = random.randrange(1, 1000 * 1000)
> db.execute('select offers.id, offers.name, categories.id,
> categories.name from offers left join categories on categories.id =
> offers.category_id where offers.id = %s', (id,))
> print db.fetchall()
>
> We start 20 instances simultaneously and measure performance:
>
> parallel -j 20 ./test.py -- $(seq 1 20) | pv -i1 -l > /dev/null
>
> Normally we observe about 30k QPS what's a satisfying result (without any
> tuning at all).
>
> The problem occurs when we open a second console, start psql and type:
>
> pgtest=> begin; insert into categories (name) select 'category_' || x from
> generate_series(1,1000) as x;
>

Related topics have been discussed recently, but without much apparent
resolution.

See "In progress INSERT wrecks plans on table" and "Performance bug in
prepared statement binding in 9.2" also on this list

The issues are:

1) The planner actually queries the relation to find the end points of the
variable ranges, rather than using potentially out-of-date statistics.

2) When doing so, it needs to wade through the "in-progress" rows, figuring
out whether the owning transaction is still in progress or has already
committed or aborted. If the owning transaction *has* committed or rolled
back, then it can set hint bits so that future executions don't need to do
this. But if the owning transaction is still open, then the querying
transaction has done the work, but is not able to set any hint bits so
other executions also need to do the work, repeatedly until the other
transactions finishes.

3) Even worse, asking if a given transaction has finished yet can be a
serious point of system-wide contention, because it takes the
ProcArrayLock, once per row which needs to be checked. So you have 20
processes all fighting over the ProcArrayLock, each doing so 1000 times per
query.

One idea (from Simon, I think) was to remember that a transaction was just
checked and was in progress, and not checking it again for future rows. In
the future the transaction might have committed, but since it would have
committed after we took the snapshot, thinking it is still in progress
would not be a correctness problem, it would just needlessly delay setting
the hint bits.

Another idea was not to check if it were in progress at all, because if it
is in the snapshot it doesn't matter if it is still in progress. This
would a slightly more aggressive way to delay setting the hint bit (but
also delay doing the work needed to figure out how to set them).

Items 2 and 3 and can also arise in situations other than paired with 1.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jesper Krogh 2013-09-21 05:08:20 Re: Planner performance extremely affected by an hanging transaction (20-30 times)?
Previous Message David Whittaker 2013-09-20 18:11:12 Re: Intermittent hangs with 9.2