Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Naik, Sameer" <Sameer_Naik(at)bmc(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Deepak Somaiya <deepsom(at)yahoo(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "david(at)kineticode(dot)com" <david(at)kineticode(dot)com>
Subject: Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans
Date: 2019-05-23 17:17:30
Message-ID: 15960.1558631850@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Naik, Sameer" <Sameer_Naik(at)bmc(dot)com> writes:
> On Mon, May 20, 2019 at 09:37:34PM +0000, Deepak Somaiya wrote:
>> wow this is interesting!
>> @Tom, Bruce, David - Experts
>> Any idea why would changing the datatype would cause so much degradation - this is even when plan remains the same ,data is same.

I see nothing very exciting here. text equality comparison reduces to
a memcmp, while citext equality comparison is quite expensive, since
it has to case-fold both inputs before it can memcmp them.

For the given test case:

> ' -> Index Scan using i776_0_400129200_t776 on t776 (cost=0.42..12.66 rows=1 width=52) (actual time=1187.686..5531.421 rows=48 loops=1)'
> ' Index Cond: ((c400129200 = $1) AND (c400127400 = $2))'
> ' Filter: (((c400129100 <> $3) OR (c400129100 IS NULL)) AND ((c179 = $4) OR (c179 = $5) OR (c179 = $6) OR (c179 = $7) OR (c179 = $8) OR (c179 = $9) OR (c179 = $10) OR (c179 = $11) OR (c179 = $12) OR (c179 = $13) OR (c179 = $14) OR (c179 = $15) OR (c179 = $16) OR (c179 = $17) OR (c179 = $18) OR (c179 = $19) OR (c179 = $20) OR (c179 = $21) OR (c179 = $22) OR (c179 = $23) OR (c179 = $24) OR (c179 = $25) OR (c179 = $26) OR (c179 = $27) OR (c179 = $28) OR (c179 = $29) OR (c179 = $30) OR (c179 = $31) OR (c179 = $32) OR (c179 = $33) OR (c179 = $34) OR (c179 = $35) OR (c179 = $36) OR (c179 = $37) OR (c179 = $38) OR (c179 = $39) OR (c179 = $40) OR (c179 = $41) OR (c179 = $42) OR (c179 = $43) OR (c179 = $44) OR (c179 = $45) OR (c179 = $46) OR (c179 = $47) OR (c179 = $48) OR (c179 = $49) OR (c179 = $50) OR (c179 = $51)))'
> ' Rows Removed by Filter: 55322'

it's reasonable to suppose that not many of the rows are failing the
c400129100 conditions, so that in order to decide that a row doesn't
pass the filter, we are forced to perform each of the OR'd c179
comparisons. So this query did something like 48 * 55322 equality
comparisons for c179. If the cost of a citexteq evaluation is
around 2 microseconds, that'd fully explain the runtime differential.

The OP didn't say what locale or encoding he's using. Maybe switching
to some other settings would improve matters ... though if non-ASCII
case folding is a business requirement, that likely won't go far.

Or you could get rid of the need for the repetitive case-folding,
say by storing lower(c179) in a separate column and doing plain
text comparisons to pre-lowercased input values.

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2019-05-24 17:12:13 Re: [HACKERS] proposal: schema variables
Previous Message Stephen Frost 2019-05-23 15:08:35 Re: upgrade to PG11 on secondary fails (no initdb was launched)