BUG #8620: SELECT on Materialized View Fails to Use Index

From: jesse(dot)denardo(at)myfarms(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #8620: SELECT on Materialized View Fails to Use Index
Date: 2013-11-23 14:19:55
Message-ID: E1VkE3v-0006jl-7n@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 8620
Logged by: Jesse Denardo
Email address: jesse(dot)denardo(at)myfarms(dot)com
PostgreSQL version: 9.3.1
Operating system: Arch Linux x86_64 3.11.6-1-ARCH
Description:

I've got two seemingly identical servers running on two different machines.
Both are running Postgres 9.3.1, PostGIS 2.1.0, and both on Arch Linux.

On both servers I used this SQL script to create a materialized view and two
indexes on it:

CREATE MATERIALIZED VIEW gis.mat_land_boundaries AS (
SELECT 'US.' || id AS id
, ( SELECT id FROM i18n.countries WHERE code = 'US' ) AS
countries_id
, ST_SetSRID(geom,4326) AS geom
FROM gis.clu
UNION ALL
SELECT 'ZA.' || gid AS id
, ( SELECT id FROM i18n.countries WHERE code = 'ZA' ) AS
countries_id
, ST_SetSRID(geom,4326) AS geom
FROM gis.za_clu
);

CREATE UNIQUE INDEX mat_land_boundaries_id_unique
ON gis.mat_land_boundaries (id);

CREATE INDEX mat_land_boundaries_geom_gist
ON gis.mat_land_boundaries USING gist (geom);

This view is several gigabytes large and holds 28 million rows of geometry
data. I then run the following query on both servers:

explain analyze SELECT "id",
ST_AsGeoJSON(ST_SimplifyPreserveTopology(geom,0.00003)) as geometry FROM
"gis"."mat_land_boundaries" WHERE "geom" IS NOT NULL AND
ST_Intersects(geom,ST_SetSRID(ST_GeomFromText('POLYGON((-84.98954772949219
40.327701904195926, -84.98954772949219 40.362961345186555,
-84.89994049072266 40.362961345186555, -84.89994049072266
40.327701904195926, -84.98954772949219 40.327701904195926))'),4326)) LIMIT
800;

On one server, the query returns in a couple hundred milliseconds because it
used the mat_land_boundaries_geom_gist index:

QUERY PLAN




----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
-----------------
Limit (cost=0.42..1523.40 rows=116 width=536) (actual time=34.701..293.895
rows=
670 loops=1)
-> Index Scan using mat_land_boundaries_geom_gist on mat_land_boundaries
(cos
t=0.42..1523.40 rows=116 width=536) (actual time=34.700..293.720 rows=670
loops=1)
Index Cond: ((geom IS NOT NULL) AND (geom &&
'0103000020E6100000010000000
5000000000000C0543F55C0E1ADD022F2294440000000C0543F55C0EBA47184752E4440000000A0983
955C0EBA47184752E4440000000A0983955C0E1ADD022F2294440000000C0543F55C0E1ADD022F2294
440'::geometry))
Filter: _st_intersects(geom,
'0103000020E61000000100000005000000000000C05
43F55C0E1ADD022F2294440000000C0543F55C0EBA47184752E4440000000A0983955C0EBA47184752
E4440000000A0983955C0E1ADD022F2294440000000C0543F55C0E1ADD022F2294440'::geometry)
Rows Removed by Filter: 1
Total runtime: 294.120 ms
(6 rows)

The other server takes well over a minute because it does not use the
index:


QUERY PL
AN



-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------
Limit (cost=0.00..4187.64 rows=800 width=64) (actual
time=23768.060..112201.963 r
ows=670 loops=1)
-> Seq Scan on mat_land_boundaries (cost=0.00..9898135.26 rows=1890924
width=6
4) (actual time=23768.057..112201.846 rows=670 loops=1)
Filter: ((geom IS NOT NULL) AND (geom &&
'0103000020E610000001000000050000
00000000C0543F55C0E1ADD022F2294440000000C0543F55C0EBA47184752E4440000000A0983955C0E
BA47184752E4440000000A0983955C0E1ADD022F2294440000000C0543F55C0E1ADD022F2294440'::g
eometry) AND _st_intersects(geom,
'0103000020E61000000100000005000000000000C0543F55
C0E1ADD022F2294440000000C0543F55C0EBA47184752E4440000000A0983955C0EBA47184752E44400
00000A0983955C0E1ADD022F2294440000000C0543F55C0E1ADD022F2294440'::geometry))
Rows Removed by Filter: 28505727
Total runtime: 112202.105 ms
(5 rows)

(Interestingly, the query returned one fewer row, even though both views
should contain identical data.) I've confirmed that the views were created
correctly, and the indexes were created. I tried dropping/recreating the
index and rebooting PostgreSQL and the entire box, but the behavior
persists. I've even ensured that the packages on both machines were all up
to date with identical versions. Any idea why Postgres would fail to use the
index on one machine but not the other, for identical queries?

# \d mat_land_boundaries
Materialized view "gis.mat_land_boundaries"
Column | Type | Modifiers
--------------+----------+-----------
id | text |
countries_id | integer |
geom | geometry |
Indexes:
"mat_land_boundaries_id_unique" UNIQUE, btree (id)
"mat_land_boundaries_geom_gist" gist (geom)

# \di mat_land_boundaries_geom_gist
List of relations
Schema | Name | Type | Owner | Table

--------+-------------------------------+-------+----------+---------------------
gis | mat_land_boundaries_geom_gist | index | postgres |
mat_land_boundaries
(1 row)

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message esh.debajit09 2013-11-23 15:06:28 BUG #8621: unable to run the stack wiard
Previous Message Peter Eisentraut 2013-11-23 13:04:51 Re: pgdump not dumping my database