From: | balasubramanian c r <crbs(dot)siebel(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | GIN INdex is not used with && operator for a text array index |
Date: | 2023-12-06 14:32:52 |
Message-ID: | CANnzXMPHcGyv1v5J+jcJyP6hUwd2O7ZLA6yjQJxx332Ou+m63w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
HI Team
Sorry for the spam.
We have Postgres DB where the list of addresses are stored and for a given
complete address
trigram of addresses are stored in a column which is a text array.
after looking at the list of operators that are available for gin index I
decided to use array_ops operator.
select amop.amopopr::regoperator, amop.amopstrategy, opc.opcname from
pg_opclass opc, pg_opfamily opf, pg_am am, pg_amop amop where opf.oid =
opc.opcfamily and am.oid = opf.opfmethod and amop.amopfamily =
opc.opcfamily and am.amname = 'gin' and amop.amoplefttype = opc.opcintype;
amopopr | amopstrategy | opcname
-------------------------+--------------+----------------
&&(anyarray,anyarray) | 1 | array_ops
@>(anyarray,anyarray) | 2 | array_ops
<@(anyarray,anyarray) | 3 | array_ops
=(anyarray,anyarray) | 4 | array_ops
CREATE INDEX pentgram_idx ON address18 USING GIN(pentgram array_ops);
when Operator '@>' is used the index is used and the execution time is 60ms.
db=# EXPLAIN (costs, buffers, verbose, analyze) select similarity('jattan
kangra riyali 144 hp 176058', complete_address) from address18 where
pentgram @> show_trgm('jattan kangra riyali 144 hp 176058');
Bitmap Heap Scan on public.address18 (cost=261.25..262.52 rows=1 width=4)
(actual time=58.992..58.994 rows=1 loops=1)
Output: similarity('*****'::text, complete_address)
Recheck Cond: (address18.pentgram @> '{******}'::text[])
Heap Blocks: exact=1
Buffers: shared hit=1483
-> Bitmap Index Scan on pentgram_idx (cost=0.00..261.25 rows=1
width=0) (actual time=58.960..58.960 rows=1 loops=1)
Index Cond: (address18.pentgram @> '{***}'::text[])
Buffers: shared hit=1482
Query Identifier: -126591413296272164
Planning:
Buffers: shared hit=1
Planning Time: 0.679 ms
Execution Time: 60.373 ms
(13 rows)
when Operator '&&' is used the index is used and the execution time is 60ms.
It is performing sequential scan which is not expected.
db=# EXPLAIN (costs, buffers, verbose, analyze) select similarity('jattan
kangra riyali 144 hp 176058', complete_address) from address18 where
pentgram && show_trgm('jattan kangra riyali 144 hp 176058');
Seq Scan on public.address18 (cost=0.00..77215.11 rows=247741 width=4)
(actual time=0.063..1880.467 rows=247741 loops=1)
Output: similarity('****'::text, complete_address)
Filter: (address18.pentgram && '{"****}'::text[])
Buffers: shared hit=3592 read=69907
I/O Timings: shared/local read=267.274
Query Identifier: 2367846469053211383
Planning:
Buffers: shared hit=1
Planning Time: 0.386 ms
Execution Time: 1886.125 ms
(10 rows)
When i disable the sequential scan the execution time is increased
significantly.
db=# EXPLAIN (costs, buffers, verbose, analyze) select similarity('jattan
kangra riyali 144 hp 176058', complete_address) from address18 where
pentgram && show_trgm('jattan kangra riyali 144 hp 176058');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.address18 (cost=2266.24..79481.36 rows=247741
width=4) (actual time=186.402..3285.090 rows=247741 loops=1)
Output: similarity('****'::text, complete_address)
Recheck Cond: (address18.pentgram && '{***********}'::text[])
Heap Blocks: exact=39632
Buffers: shared hit=1361 read=39155 written=6260
I/O Timings: shared/local read=981.127 write=544.491
-> Bitmap Index Scan on pentgram_idx (cost=0.00..2204.31 rows=247741
width=0) (actual time=182.462..182.463 rows=247741 loops=1)
Index Cond: (address18.pentgram && '{*****}'::text[])
Buffers: shared hit=884
Query Identifier: 2367846469053211383
Planning:
Buffers: shared hit=1
Planning Time: 6.707 ms
Execution Time: 3292.339 ms
(14 rows)
Not expecting this behavior currently.
Few parameters in my configuration
postgresql version is 15
OS is RHEL 8.8
effective_cache_size | 131072
huge_pages | off
maintenance_work_mem | 327680
max_parallel_maintenance_workers | 1
max_parallel_workers | 2
max_parallel_workers_per_gather | 1
shared_buffers
| 32768
Thanks
C.R.Bala
From | Date | Subject | |
---|---|---|---|
Next Message | Ron Johnson | 2023-12-06 14:43:27 | Re: Store PDF files in PostgreDB |
Previous Message | Joe Conway | 2023-12-06 13:49:13 | Re: Emitting JSON to file using COPY TO |