Re: Searching in varchar column having 100M records

From: mayank rupareliya <mayankjr03(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Searching in varchar column having 100M records
Date: 2019-07-19 14:13:26
Message-ID: CACX+qbynqp2nN3S8VQAEx9rpbQ-4xs-WVpuSe_aPreiYA3OXmw@mail.gmail.com
Lists: pgsql-performance

> Well, you haven't shown us the execution plan, so it's hard to check why
> it did not help much and give you further advice.

This is the latest query execution plan (EXPLAIN ANALYZE) after adding
indexes on both columns.

Aggregate  (cost=174173.57..174173.58 rows=1 width=8) (actual time=65087.657..65087.658 rows=1 loops=1)
  ->  Bitmap Heap Scan on fields  (cost=1382.56..174042.61 rows=52386 width=0) (actual time=160.340..65024.533 rows=31857 loops=1)
        Recheck Cond: ((field)::text = 'Champlin'::text)
        Heap Blocks: exact=31433
        ->  Bitmap Index Scan on index_field  (cost=0.00..1369.46 rows=52386 width=0) (actual time=125.078..125.079 rows=31857 loops=1)
              Index Cond: ((field)::text = 'Champlin'::text)
Planning Time: 8.595 ms
Execution Time: 65093.508 ms
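
Note that this plan still shows a bitmap scan on index_field rather than an
Index Only Scan, so the covering index may not be getting used. A minimal
sketch of what Tomas suggested in the quoted mail below (the table and column
names are taken from the plan and thread; the SELECT is an assumed stand-in,
since the original query text is not quoted here):

    CREATE INDEX ios_idx ON fields (field, user_id);

    -- keep the visibility map current so index-only scans are possible
    VACUUM (ANALYZE) fields;

    -- an effective plan would show "Index Only Scan using ios_idx on fields"
    -- with Heap Fetches close to zero
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM fields WHERE field = 'Champlin';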

On Thu, Jul 18, 2019 at 6:11 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On Thu, Jul 18, 2019 at 05:21:49PM +0530, mayank rupareliya wrote:
> >Please recheck with track_io_timing = on in the configuration. EXPLAIN
> >(ANALYZE, BUFFERS) with this option will report how much time we spend
> >on I/O.
> >
> >> Buffers: shared hit=2 read=31492
> >
> >31492 blocks / 65 sec ~ 480 IOPS, not bad if you are using an HDD.
> >
> >Your query reads table data from disk (well, or from the OS cache). You
> >need more RAM for shared_buffers, or disks with better performance.
> >
> >
> >Thanks Sergei. track_io_timing = on helps. Following is the result after
> >changing that config:
> >
> >Aggregate  (cost=10075.78..10075.79 rows=1 width=8) (actual time=63088.198..63088.199 rows=1 loops=1)
> >  Buffers: shared read=31089
> >  I/O Timings: read=61334.252
> >  ->  Bitmap Heap Scan on fields  (cost=72.61..10069.32 rows=2586 width=0) (actual time=69.509..63021.448 rows=31414 loops=1)
> >        Recheck Cond: ((field)::text = 'Klein'::text)
> >        Heap Blocks: exact=30999
> >        Buffers: shared read=31089
> >        I/O Timings: read=61334.252
> >        ->  Bitmap Index Scan on index_field  (cost=0.00..71.96 rows=2586 width=0) (actual time=58.671..58.671 rows=31414 loops=1)
> >              Index Cond: ((field)::text = 'Klein'::text)
> >              Buffers: shared read=90
> >              I/O Timings: read=45.316
> >Planning Time: 66.435 ms
> >Execution Time: 63088.774 ms
> >
>
> How did that help? It only confirms that it is really the I/O that takes
> the time. You need to reduce that, somehow.
>
> >
> >So try something like
> >
> >  CREATE INDEX ios_idx ON table (field, user_id);
> >
> >and make sure the table is vacuumed often enough (so that the visibility
> >map is up to date).
> >
> >Thanks Tomas... I tried this and the result improved, but not by much.
> >
>
> Well, you haven't shown us the execution plan, so it's hard to check why
> it did not help much and give you further advice.
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
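
For reference, enabling the I/O timing instrumentation Sergei recommends in
the quoted thread is a single setting (superuser; shown per session as a
sketch, it can also be set cluster-wide and the query below is an assumed
stand-in for the real one):

    SET track_io_timing = on;
    -- or persistently, followed by SELECT pg_reload_conf():
    -- ALTER SYSTEM SET track_io_timing = on;

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM fields WHERE field = 'Klein';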
