Quick Links

Re: Weird performance issue with custom function with a for loop.

From:	Nicos Panayides <nicos(at)magneta(dot)com(dot)cy>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Weird performance issue with custom function with a for loop.
Date:	2011-02-01 18:50:22
Message-ID:	4D4855EE.2030703@magneta.com.cy
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 02/01/2011 06:52 PM, Tom Lane wrote:
> Nicos Panayides<nicos(at)magneta(dot)com(dot)cy> writes:
>
>> On 01/31/2011 08:48 PM, Tom Lane wrote:
>>
>>> It seems likely that you're getting a different plan for the generic
>>> case because that user id isn't representative of the overall average
>>> for the column.
>>>
>
>> I tried the prepared statement with both $1 and 1811 for user_id and
>> here's the plans I got:
>>
> [ bad ]
>
>> " -> Seq Scan on game_round_actions (cost=0.00..51702078.26
>> rows=314 width=53)"
>> " Filter: ((action_time>= $2) AND (action_time<= $3) AND
>> (sub_action_id = 0) AND (user_id = $1))"
>>
> [ good ]
>
>> " -> Index Scan using i_session on game_round_actions
>> (cost=0.00..224166.97 rows=300 width=53)"
>> " Index Cond: ((action_time>= $2) AND (action_time<= $3))"
>> " Filter: (user_id = 1811)"
>>
> So the question is why it won't use that index in the parameterized case ...
>
>
>> CREATE INDEX i_session
>> ON game_round_actions
>> USING btree
>> (action_time)
>> WHERE user_id<> 0 AND sub_action_id = 0;
>>
> ... and the answer is that it can't prove user_id<> 0 when it doesn't
> know the value of the parameter equated to user_id, so it cannot build
> a plan that relies on using that partial index. (IOW, if it did use the
> index, it would get the wrong answer if $1 happened to be zero.)
>
> I don't know the reason you had for making the index partial in the
> first place, but maybe you should reconsider that. Another possibility
> is to explicitly include "user_id<> 0" in the query conditions, if
> you're certain that the passed-in value is never zero.
>
> regards, tom lane
>
I added an additional user_id <> 0 and performance is normal.
I didn't know that the database does not use the value of the parameter
to chose a plan. It makes sense now that
you have explained it.

The reason we use a partial index is that the table is very huge and we
are really interested only in those rows. Actually the condition should
have used user_id IS NOT NULL instead of <> 0.
Unfortunately the index takes quite a few hours to build so we
implemented the workaround you suggested for now. If the condition was
"user_id IS NOT NULL AND sub_action_id=0" would postgres always chose
the index since user_id = NULL will always be false?

Thank you very much for help!

--
Regards,

Nicos Panayides
IT Manager

Magneta Technologies Ltd
Tel: +357 22721919, 22317400
Fax: +357 22721917
Web: http://www.magneta.eu

In response to

Re: Weird performance issue with custom function with a for loop. at 2011-02-01 16:52:41 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Yngve Nysaeter Pettersen	2011-02-01 18:51:19	Re: Select for update with offset interferes with concurrent transactions
Previous Message	Andy Colson	2011-02-01 18:29:08	Re: Select for update with offset interferes with concurrent transactions