citext performance

From: Deepak Somaiya <deepsom(at)yahoo(dot)com>
To: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, "david(at)kineticode(dot)com" <david(at)kineticode(dot)com>
Subject: citext performance
Date: 2018-04-06 16:51:14
Message-ID: 1991635447.2385019.1523033474018@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Folks, I read following (PostgreSQL: Documentation: 9.6: citext) and it does not hold true in my testing.. i.e citext is not performing better than lower.Am I missing something? help is appreciated.

|
|
|
| | |

|

|
|
| |
PostgreSQL: Documentation: 9.6: citext

|

|

|

"citext is not as efficient as text because the operator functions and the B-tree comparison functions must make copies of the data and convert it to lower case for comparisons. It is, however, slightly more efficient than using lower to get case-insensitive matching."

Here is what I have done 
drop table test;drop table testci;
CREATE TABLE test (id INTEGER PRIMARY KEY,name character varying(254));CREATE TABLE testci (id INTEGER PRIMARY KEY,name citext
);
INSERT INTO test(id, name)SELECT generate_series(1000001,2000000), (md5(random()::text));
INSERT INTO testci(id, name)SELECT generate_series(1,1000000), (md5(random()::text));

Now, I have done sequential search
explain (analyze on, format yaml) select * from test where lower(name)=lower('f6d7d5be1d0bed1cca11540d3a2667de');
- Plan:     Node Type: "Seq Scan"    Parallel Aware: false    Relation Name: "test"    Alias: "test"    Startup Cost: 0.00    Total Cost: 23334.00    Plan Rows: 5000    Plan Width: 37    Actual Startup Time: 0.016    Actual Total Time: 680.199    Actual Rows: 1    Actual Loops: 1    Filter: "(lower((name)::text) = 'f6d7d5be1d0bed1cca11540d3a2667de'::text)"    Rows Removed by Filter: 999999  Planning Time: 0.045  Triggers:   Execution Time: 680.213

explain (analyze on, format yaml) select * from testci where name='956d692092f0b9f85f36bf2b2501f3ad';
- Plan:     Node Type: "Seq Scan"    Parallel Aware: false    Relation Name: "testci"    Alias: "testci"    Startup Cost: 0.00    Total Cost: 20834.00    Plan Rows: 1    Plan Width: 37    Actual Startup Time: 0.017    Actual Total Time: 1184.485    Actual Rows: 1    Actual Loops: 1    Filter: "(name = '956d692092f0b9f85f36bf2b2501f3ad'::citext)"    Rows Removed by Filter: 999999  Planning Time: 0.029  Triggers:   Execution Time: 1184.496

You can see sequential searches with lower working twice as fast as citext.
Now I added index on citext and equivalent functional index (lower) on text.

CREATE INDEX textlowerindex ON test (lower(name));
create index textindex on test(name);

Index creation took longer with citext v/s creating lower functional index.

Now here comes execution with indexes
explain (analyze on, format yaml) select * from test where lower(name)=lower('f6d7d5be1d0bed1cca11540d3a2667de');

- Plan:     Node Type: "Bitmap Heap Scan"    Parallel Aware: false    Relation Name: "test"    Alias: "test"    Startup Cost: 187.18    Total Cost: 7809.06    Plan Rows: 5000    Plan Width: 37    Actual Startup Time: 0.020    Actual Total Time: 0.020    Actual Rows: 1    Actual Loops: 1    Recheck Cond: "(lower((name)::text) = 'f6d7d5be1d0bed1cca11540d3a2667de'::text)"    Rows Removed by Index Recheck: 0    Exact Heap Blocks: 1    Lossy Heap Blocks: 0    Plans:       - Node Type: "Bitmap Index Scan"        Parent Relationship: "Outer"        Parallel Aware: false        Index Name: "textlowerindex"        Startup Cost: 0.00        Total Cost: 185.93        Plan Rows: 5000        Plan Width: 0        Actual Startup Time: 0.016        Actual Total Time: 0.016        Actual Rows: 1        Actual Loops: 1        Index Cond: "(lower((name)::text) = 'f6d7d5be1d0bed1cca11540d3a2667de'::text)"  Planning Time: 0.051  Triggers:   Execution Time: 0.035

explain (analyze on, format yaml) select * from test where lower(name)=lower('f6d7d5be1d0bed1cca11540d3a2667de');

- Plan:     Node Type: "Index Scan"    Parallel Aware: false    Scan Direction: "Forward"    Index Name: "citextindex"    Relation Name: "testci"    Alias: "testci"    Startup Cost: 0.42    Total Cost: 8.44    Plan Rows: 1    Plan Width: 37    Actual Startup Time: 0.049    Actual Total Time: 0.050    Actual Rows: 1    Actual Loops: 1    Index Cond: "(name = '956d692092f0b9f85f36bf2b2501f3ad'::citext)"    Rows Removed by Index Recheck: 0  Planning Time: 0.051  Triggers:   Execution Time: 0.064

Deepak

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Nandakumar M 2018-04-08 10:12:46 Re: citext performance
Previous Message Adam Brusselback 2018-04-06 16:30:52 [PERFORM] Dissuade the use of exclusion constraint index