From: | Oleg Bartunov <obartunov(at)gmail(dot)com> |
---|---|
To: | Aaron Lewis <the(dot)warl0ck(dot)1989(at)gmail(dot)com> |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Trigram is slow when 10m rows |
Date: | 2016-11-13 19:02:00 |
Message-ID: | CAF4Au4wGcFynSqS2hXNkh53zG=U06Dy_Nw8eJV3Rk72Oqbwzgg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Sun, Nov 13, 2016 at 5:25 PM, Aaron Lewis <the(dot)warl0ck(dot)1989(at)gmail(dot)com>
wrote:
> Thanks Oleg.
>
> I've increased work_mem to 128MB, now the query falls down to 1.7s,
> faster but still not good enough.
>
> Is there any other thing I can do about it?
>
your query 'x264' is short in terms of the number of trigrams, so trigram
index isn't good. Did you tried text_pattern_ops for btree ? Something like
create index title_btree_idx on mytable using btree(title text_pattern_ops
);
>
> test=# explain analyze select * from mytable where title ilike 'x264';
> QUERY PLAN
> ------------------------------------------------------------
> --------------------------------------------------------------------------
> Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380
> width=83) (actual time=1754.656..1754.656 rows=0 loops=1)
> Recheck Cond: (title ~~* 'x264'::text)
> Rows Removed by Index Recheck: 1220793
> Heap Blocks: exact=197567
> -> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35
> rows=1380 width=0) (actual time=346.663..346.663 rows=1220793 loops=1)
> Index Cond: (title ~~* 'x264'::text)
> Planning time: 1.168 ms
> Execution time: 1755.944 ms
>
>
> On Sun, Nov 13, 2016 at 10:04 PM, Oleg Bartunov <obartunov(at)gmail(dot)com>
> wrote:
> >
> >
> > On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis <the(dot)warl0ck(dot)1989(at)gmail(dot)com
> >
> > wrote:
> >>
> >> I have a simple table with Trigram index,
> >>
> >> create table mytable(hash char(40), title text);
> >> create index title_trgm_idx on mytable using gin(title gin_trgm_ops);
> >>
> >> When I run a query with 10m rows, it uses the Trigram index, but takes
> >> 3s to execute, very slow.
> >> (I have 80m rows, but only inserted 10m for testing purpose)
> >>
> >> test=# select count(*) from mytable;
> >> count
> >> ----------
> >> 13971887
> >> (1 row)
> >>
> >> test=# explain analyze select * from mytable where title ilike 'x264';
> >> QUERY PLAN
> >>
> >> ------------------------------------------------------------
> --------------------------------------------------------------------------
> >> Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380
> >> width=83) (actual time=2937.308..2937.308 rows=0 loops=1)
> >> Recheck Cond: (title ~~* 'x264'::text)
> >> Rows Removed by Index Recheck: 11402855
> >> Heap Blocks: exact=39557 lossy=158010
> >> -> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35
> >> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1)
> >> Index Cond: (title ~~* 'x264'::text)
> >> Planning time: 0.611 ms
> >> Execution time: 2937.729 ms
> >> (8 rows)
> >>
> >> Any ideas to speed things up?
> >
> >
> > Rows Removed by Index Recheck: 11402855
> > Heap Blocks: exact=39557 lossy=158010
> >
> > You need to increase work_mem
> >>
> >>
> >> --
> >> Best Regards,
> >> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> >> Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33
> >>
> >>
> >> --
> >> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> >> To make changes to your subscription:
> >> http://www.postgresql.org/mailpref/pgsql-general
> >
> >
>
>
>
> --
> Best Regards,
> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33
>
From | Date | Subject | |
---|---|---|---|
Next Message | aws backup | 2016-11-13 21:01:57 | Re: pg_dumpall: could not connect to database "template1": FATAL: |
Previous Message | Tom Lane | 2016-11-13 18:51:44 | Re: Why is this query not using GIN index? |