From: | Hiroyuki Sato <hiroysato(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andreas Kretschmer <andreas(at)a-kretschmer(dot)de>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: grep -f keyword data query |
Date: | 2015-12-29 15:11:00 |
Message-ID: | CA+Tq-Ro_CHUHUizX-PwmEij-dS0Leekd-6yxx2G9D6D63rb7kA@mail.gmail.com |
Lists: | pgsql-general |
Hello Tom.
Thank you for replying.
This is the result with a GIN index.
It is still slow.
Best regards.
--
Hiroyuki Sato
1, create.sql
drop table if exists url_lists4;
create table url_lists4 (
id int not null primary key,
url text not null
);
--create index ix_url_url_lists4 on url_lists4(url);
create index ix_url_url_lists4 on url_lists4 using gin(url gin_trgm_ops);
drop table if exists keywords4;
create table keywords4 (
id int not null primary key,
name varchar(40) not null,
url text not null
);
create index ix_url_keywords4 on keywords4(url);
create index ix_name_keywords4 on keywords4(name);
\copy url_lists4(id,url) from 'sample.txt' with delimiter ',';
\copy keywords4(id,name,url) from 'keyword.txt' with delimiter ',';
2, EXPLAIN
QUERY PLAN
------------------------------------------------------------------------------------------
 Nested Loop  (cost=22.55..433522.66 rows=12500000 width=57)
   ->  Seq Scan on keywords4 k  (cost=0.00..104.50 rows=5000 width=28)
         Filter: ((name)::text = 'esc_url'::text)
   ->  Bitmap Heap Scan on url_lists4 u  (cost=22.55..61.68 rows=2500 width=57)
         Recheck Cond: (url ~~ k.url)
         ->  Bitmap Index Scan on ix_url_url_lists4  (cost=0.00..21.92 rows=2500 width=0)
               Index Cond: (url ~~ k.url)
(7 rows)
3, EXPLAIN ANALYZE
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=22.55..433522.66 rows=12500000 width=57) (actual time=7227.210..1753163.751 rows=4850 loops=1)
   ->  Seq Scan on keywords4 k  (cost=0.00..104.50 rows=5000 width=28) (actual time=0.035..16.577 rows=5000 loops=1)
         Filter: ((name)::text = 'esc_url'::text)
   ->  Bitmap Heap Scan on url_lists4 u  (cost=22.55..61.68 rows=2500 width=57) (actual time=350.625..350.626 rows=1 loops=5000)
         Recheck Cond: (url ~~ k.url)
         Rows Removed by Index Recheck: 0
         Heap Blocks: exact=159
         ->  Bitmap Index Scan on ix_url_url_lists4  (cost=0.00..21.92 rows=2500 width=0) (actual time=350.618..350.618 rows=1 loops=5000)
               Index Cond: (url ~~ k.url)
 Planning time: 0.169 ms
 Execution time: 1753165.329 ms
(11 rows)
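For reference, the plan above is consistent with a query of roughly the following shape. This is only a sketch inferred from the Filter and Index Cond lines, not the original statement: the select list and the `create extension` line are assumptions. The inner bitmap index scan takes about 350.6 ms and runs once per keyword row (loops=5000), so 5000 x 350.6 ms is about 1753 s, which accounts for nearly all of the execution time.

```sql
-- gin_trgm_ops is provided by the pg_trgm extension
-- (assumed here; the original post does not show this step).
create extension if not exists pg_trgm;

-- Assumed query shape: one LIKE probe of the trigram index per keyword row.
-- The select list is a guess; the plan only shows the join and filter conditions.
explain analyze
select u.id, u.url
from keywords4 k
join url_lists4 u on u.url like k.url
where k.name = 'esc_url';
```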
On Tue, Dec 29, 2015 at 2:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Hiroyuki Sato <hiroysato(at)gmail(dot)com> writes:
> > I re-created index with pg_trgm.
> > Execution time is 210sec.
> > Yes, it is faster than the btree index, but still slow.
> > Is it possible to improve this query's speed?
> > Should I use another query or index?
>
> Did you try a GIN index?
>
> regards, tom lane
>