From: | Priyank Tiwari <priyankgt(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Any ideas how can I speed up this query? |
Date: | 2015-07-28 07:52:16 |
Message-ID: | CADbPz3G+rUaf3gDm5QQ97=38E2HNd619ZJ=9v61uY24p3P0xUA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi,
I have following table definition with 6209888 rows in it. It stores the
occurrences of species in various regions.
*TABLE DEFINITION*
Column | Type | Modifiers
--------------+------------------------+----------------------------------------------------------
id | integer | not null default
nextval('occurrences_id_seq'::regclass)
gbifid | integer | not null
sname | character varying(512) |
cname | character varying(512) |
species | character varying(512) |
location | geometry | not null
month | integer |
year | integer |
event_date | date |
dataset_key | character varying(512) |
taxon_key | character varying(512) |
taxon_rank | character varying(512) |
record_basis | character varying(512) |
category_id | integer |
country | character varying(512) |
lat | double precision |
lng | double precision |
Indexes:
"occurrences_pkey" PRIMARY KEY, btree (id)
"unique_occurrences_gbifid" UNIQUE, btree (gbifid)
"index_occurences_taxon_key" btree (taxon_key)
"index_occurrences_category_id" btree (category_id)
"index_occurrences_cname" btree (cname)
"index_occurrences_country" btree (country)
"index_occurrences_lat" btree (lat)
"index_occurrences_lng" btree (lng)
"index_occurrences_month" btree (month)
"index_occurrences_sname" btree (sname)
"occurrence_location_gix" gist (location)
I am trying to fetch the count of number of occurrences within a certain
region. I save the location of each occurrence as a geometric field as well
as lat, lng combination. Both fields are indexed. The query that is issued
is as follows.
*QUERY*
SELECT COUNT(*) FROM "occurrences" WHERE ("lat" >= -27.91550355958 AND
"lat" <= -27.015680440420002 AND "lng" >= 152.13307044728307 AND "lng" <=
153.03137355271693 AND "category_id" = 1 AND (ST_Intersects(
ST_Buffer(ST_PointFromText('POINT(152.582222 -27.465592)')::geography,
50000)::geography, location::geography)));
The problem is it takes more than acceptable time to execute the query.
Below is the explain analyze output for the same query.
*EXPLAIN ANALYZE QUERY OUTPUT (**http://explain.depesz.com/s/p2a
<http://explain.depesz.com/s/p2a>)*
Aggregate (cost=127736.06..127736.07 rows=1 width=0) (actual
time=13491.678..13491.679 rows=1 loops=1)
Buffers: shared hit=3 read=56025
-> Bitmap Heap Scan on occurrences (cost=28249.46..127731.08 rows=1995
width=0) (actual time=528.053..13388.458 rows=167511 loops=1)
Recheck Cond: ((lat >= (-27.91550355958)::double precision) AND
(lat <= (-27.01568044042)::double precision) AND (lng >=
152.133070447283::double precision) AND (lng <= 153.031373552717::double
precision))
Rows Removed by Index Recheck: 748669
Filter: ((category_id = 1) AND
('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography
&& (location)::geography) AND
(_st_distance('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography,
(location)::geography, 0::double precision, false) < 1e-05::double
precision))
Rows Removed by Filter: 6357
Heap Blocks: exact=29947 lossy=22601
Buffers: shared hit=3 read=56025
-> BitmapAnd (cost=28249.46..28249.46 rows=32476 width=0)
(actual time=519.091..519.091 rows=0 loops=1)
Buffers: shared read=3477
-> Bitmap Index Scan on index_occurrences_lat
(cost=0.00..11691.20 rows=365877 width=0) (actual time=218.999..218.999
rows=392415 loops=1)
Index Cond: ((lat >= (-27.91550355958)::double
precision) AND (lat <= (-27.01568044042)::double precision))
Buffers: shared read=1444
-> Bitmap Index Scan on index_occurrences_lng
(cost=0.00..16557.01 rows=517658 width=0) (actual time=285.211..285.211
rows=550523 loops=1)
Index Cond: ((lng >= 152.133070447283::double
precision) AND (lng <= 153.031373552717::double precision))
Buffers: shared read=2033
Planning time: 2.812 ms
Execution time: 13493.617 ms
(19 rows)
It seems that the planner is underestimating the number of rows
returned in Bitmap
Heap Scan on occurrences. I have run vacuum analyze on this table couple of
times, but it still produces the same result. Any idea how I can speed up
this query? How I can assist planner in providing better row estimates for
Bitmap Heap Scan section?
*POSTGRESQL VERSION INFO*
version
------------------------------------------------------------------------------------------------------
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit
*HARDWARE*
I am running the Postgresql instance on a digital ocean vm with 1 core, SSD
disk and 1 GB of ram.
Appreciate your help.
Thanks,
Priyank
From | Date | Subject | |
---|---|---|---|
Next Message | Graeme B. Bell | 2015-07-28 08:19:00 | Re: Any ideas how can I speed up this query? |
Previous Message | Jeff Janes | 2015-07-26 00:43:06 | Re: Are many idle connections bad? |