Re: UPDATE runs slow in a transaction

From: Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE runs slow in a transaction
Date: 2008-07-16 15:28:36
Message-ID: BE875FCD-4199-41F2-AE01-0269AB76F984@informatik.hu-berlin.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I have no idea why the trigger constraints are called in the first
place since the respective columns are not touched in the query. Also
with the old correlated subquery these trigger constraints were not
called either.

Cheers,
Viktor

Am 16.07.2008 um 17:01 schrieb Pavel Stehule:

> hello
>
> second query:
>
> why without transactions are not called triggers constraint
> _FK_struct_2_collection and constraint _FK_struct_2_text?
>
> Regards
> Pavel Stehule
>
> 2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
>> Hi Pavel,
>>
>> thanks for the advice on how to uncorrelate the query. I must
>> admit I
>> didn't know about the UPDATE ... SET ... FROM ... syntax.
>>
>> Now the UPDATE runs in an acceptable time inside a transaction,
>> however the
>> query plan still differs when I run it outside.
>>
>> Outside a transaction:
>>
>>
>> QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------------
>> Merge Join (cost=0.00..11481.84 rows=65756 width=1087) (actual
>> time=0.151..323.856 rows=65756 loops=1)
>> Merge Cond: (_struct.id = tmp.id)
>> -> Index Scan using "_PK_struct" on _struct (cost=0.00..7084.50
>> rows=98149 width=1083) (actual time=0.028..137.463 rows=32300
>> loops=1)
>> -> Index Scan using idx_tmp__id on tmp (cost=0.00..3330.02
>> rows=65756
>> width=12) (actual time=0.115..58.601 rows=65756 loops=1)
>> Total runtime: 2905.580 ms
>>
>> This looks like an optimal plan and average run time over 5 runs is
>> 2660 ms.
>>
>> Inside a transaction:
>>
>>
>> QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------------
>> Merge Join (cost=7427.63..16159.84 rows=65756 width=1087) (actual
>> time=315.570..574.075 rows=65756 loops=1)
>> Merge Cond: (_struct.id = tmp.id)
>> -> Index Scan using "_PK_struct" on _struct (cost=0.00..7500.50
>> rows=98149 width=1083) (actual time=0.020..129.915 rows=32300
>> loops=1)
>> -> Sort (cost=7427.63..7592.02 rows=65756 width=12) (actual
>> time=315.538..333.359 rows=65756 loops=1)
>> Sort Key: tmp.id
>> Sort Method: quicksort Memory: 4617kB
>> -> Seq Scan on tmp (cost=0.00..2165.56 rows=65756 width=12)
>> (actual time=10.070..37.411 rows=65756 loops=1)
>> Trigger for constraint _FK_struct_2_collection: time=1105.892
>> calls=32300
>> Trigger for constraint _FK_struct_2_text: time=1468.009 calls=32300
>> Total runtime: 4955.784 ms
>>
>> Again, the planner does not use the index on tmp (id) although I
>> put an
>> "ANALYZE tmp" right before the UPDATE. Average run time over 5
>> runs is 4610
>> ms.
>>
>> Thanks,
>> Viktor
>>
>> Am 16.07.2008 um 15:33 schrieb Pavel Stehule:
>>
>>> Hello
>>>
>>> my advice is little bit offtopic, I am sorry. Why you use correlated
>>> subquery? Your update statement should be
>>>
>>> update _struct set left_token = tmp.left_token from tmp where
>>> _struct.id = tmp.id;
>>>
>>> send output of explain analyze statement, please. etc
>>> explain analyze UPDATE _struct SET left_token = (SELECT DISTINCT
>>> left_token FROM tmp WHERE _struct.id = tmp.id)
>>>
>>> regards
>>> Pavel Stehule
>>>
>>>
>>>
>>> 2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
>>>>
>>>> Hi Tom,
>>>> Postgres is indeed selecting a bad plan. Turns out that the
>>>> index I
>>>> created
>>>> to speed up the UPDATE isn't used inside a transaction block.
>>>> Here's the plan for "UPDATE _struct SET left_token = (SELECT
>>>> DISTINCT
>>>> left_token FROM tmp WHERE _struct.id = tmp.id)" outside of a
>>>> transaction:
>>>> QUERY PLAN
>>>>
>>>>
>>>> -------------------------------------------------------------------------------------------
>>>> Seq Scan on _struct (cost=0.00..826643.13 rows=98149 width=1083)
>>>> SubPlan
>>>> -> Unique (cost=8.38..8.40 rows=1 width=4)
>>>> -> Sort (cost=8.38..8.39 rows=4 width=4)
>>>> Sort Key: tmp.left_token
>>>> -> Index Scan using idx_tmp__id on tmp
>>>> (cost=0.00..8.34
>>>> rows=4 width=4)
>>>> Index Cond: ($0 = id)
>>>> And inside a transaction:
>>>> QUERY PLAN
>>>>
>>>>
>>>> ---------------------------------------------------------------------------------------
>>>> Seq Scan on _struct (cost=100000000.00..3230175260746.00
>>>> rows=32300
>>>> width=70)
>>>> SubPlan
>>>> -> Unique (cost=100002329.99..100002330.01 rows=1 width=4)
>>>> -> Sort (cost=100002329.99..100002330.00 rows=4 width=4)
>>>> Sort Key: tmp.left_token
>>>> -> Seq Scan on tmp (cost=100000000.00..100002329.95
>>>> rows=4 width=4)
>>>> Filter: ($0 = id)
>>>> The high cost of the seqscan on tmp are because I tried disabling
>>>> sequential
>>>> scans inside the transaction to force an index scan, which Postgres
>>>> decided
>>>> to ignore in this case.
>>>> Putting an ANALYZE tmp and ANALYZE _struct right before the
>>>> UPDATE didn't
>>>> help either. (Also shouldn't the creation of an index on tmp
>>>> (id) take
>>>> care
>>>> of analyzing that column?)
>>>> Thanks,
>>>> Viktor
>>>> Am 14.07.2008 um 20:52 schrieb Tom Lane:
>>>>
>>>> Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de> writes:
>>>>
>>>> the script below runs very fast when executed alone. But when I
>>>> call
>>>>
>>>> it from within a transaction block it's so slow that I have to
>>>> abort
>>>>
>>>> it after a while. Specifically the second-to-last UPDATE seems to
>>>>
>>>> take forever within a transaction while it completes in about 3
>>>>
>>>> seconds outside a transaction.
>>>>
>>>> Since the table you're working on was just created in the same
>>>> transaction, there's been no opportunity for autovacuum to run an
>>>> ANALYZE on it; that's probably preventing selection of a good plan.
>>>> Try throwing in an "ANALYZE tmp" after you load the table.
>>>>
>>>> regards, tom lane
>>>>
>>>> --
>>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-general
>>>>
>>>>
>>
>>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Viktor Rosenfeld 2008-07-16 15:30:56 Re: UPDATE runs slow in a transaction
Previous Message Mason Hale 2008-07-16 15:26:01 Re: vacuum taking an unusually long time