How to improve this query?

From: Jorge Arévalo <jorge(dot)arevalo(at)deimos-space(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: How to improve this query?
Date: 2011-02-07 22:12:18
Message-ID: AANLkTinc=3K7MqB7WzRzzs8o9f2rcj3_Fg02iuyjUWWr@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

update TABLE_A set INT_FIELD = TABLE_B.INT_FIELD from TABLE_C, TABLE_B
where TABLE_B.STR_FIELD = TABLE_C.STR_FIELD and TABLE_A.INT_FIELD2 >=
TABLE_B.min and TABLE_A.INT_FIELD2 <= TABLE_B.max and
st_intersects(TABLE_A.wkb_geometry, TABLE_C.the_geom);

Tables description:

++++++++++++ TABLE A

CREATE TABLE TABLE_A
(
ogc_fid serial NOT NULL,
wkb_geometry geometry,
INT_FIELD2 integer,
INT_FIELD integer NOT NULL DEFAULT 0,
CONSTRAINT TABLE_A_pk PRIMARY KEY (ogc_fid),
CONSTRAINT enforce_dims_wkb_geometry CHECK (st_ndims(wkb_geometry) = 2),
CONSTRAINT enforce_srid_wkb_geometry CHECK (st_srid(wkb_geometry) = 23030)
)
WITH (
OIDS=FALSE
);

CREATE INDEX TABLE_A_geom_idx
ON TABLE_A
USING gist
(wkb_geometry);

CREATE INDEX TABLE_A_INT_FIELD2
ON TABLE_A
USING btree
(INT_FIELD2);

+++++++++++++++++++

++++++++++++ TABLE B

CREATE TABLE TABLE_B
(
STR_FIELD character(50) DEFAULT NULL::bpchar,
min integer NOT NULL DEFAULT 0,
max integer NOT NULL DEFAULT 0,
INT_FIELD integer NOT NULL DEFAULT 0,
oid integer NOT NULL DEFAULT 0,
CONSTRAINT TABLE_B_pk PRIMARY KEY (oid)
)
WITH (
OIDS=FALSE
);

CREATE INDEX TABLE_B_idx
ON TABLE_B
USING btree
(STR_FIELD);

CREATE INDEX TABLE_B_max_idx
ON TABLE_B
USING btree
(max);

CREATE INDEX TABLE_B_min_idx
ON TABLE_B
USING btree
(min);

+++++++++++++++++++

++++++++++++ TABLE C

CREATE TABLE TABLE_C
(
the_geom geometry,
STR_FIELD character(50)
)
WITH (
OIDS=FALSE
);

CREATE INDEX TABLE_C_index
ON TABLE_C
USING gist
(the_geom);

CREATE INDEX TABLE_C_string_idx
ON TABLE_C
USING btree
(STR_FIELD);

+++++++++++++++++++

Tables data:

- TABLE_A: 896888 entries. The geometries are single polygons
(squares, actually), coordinates are floating point numbers
- TABLE_B: 88 entries.
- TABLE C: 69352 entries. Geometries are single polygons too, but much
bigger than the polygons from TABLE_A.

As you can see in the query, I'm interested in the polygons of TABLE_A
that intersects the big polygons in TABLE_C.

Query plan (explain <query> output):

"Hash Join (cost=3.98..986808.75 rows=209049 width=497)"
" Hash Cond: (TABLE_C.STR_FIELD = TABLE_B.STR_FIELD)"
" Join Filter: ((TABLE_A.INT_FIELD2 >= TABLE_B.min) AND
(TABLE_A.INT_FIELD2 <= TABLE_B.max))"
" -> Nested Loop (cost=0.00..955055.47 rows=470360 width=543)"
" Join Filter: _st_intersects(TABLE_A.wkb_geometry, TABLE_C.the_geom)"
" -> Index Scan using TABLE_C_string_idx on TABLE_C
(cost=0.00..8044.56 rows=69352 width=1517)"
" -> Index Scan using TABLE_A_geom_idx on TABLE_A
(cost=0.00..12.61 rows=4 width=493)"
" Index Cond: (TABLE_A.wkb_geometry && TABLE_C.the_geom)"
" -> Hash (cost=2.88..2.88 rows=88 width=63)"
" -> Seq Scan on TABLE_B (cost=0.00..2.88 rows=88 width=63)"

With that information, how could I make the update faster? It takes
more than 24hours to finish.

Many thanks in advance

Best regards,

--
Jorge Arévalo
Internet & Mobilty Division, DEIMOS
jorge(dot)arevalo(at)deimos-space(dot)com
http://es.linkedin.com/in/jorgearevalo80
http://mobility.grupodeimos.com/
http://www.twitter.com/jorgeas80
http://gis4free.wordpress.com
http://geohash.org/ezjqgrgzz0g

Browse pgsql-general by date

  From Date Subject
Next Message Jens Sauer 2011-02-07 23:16:34 Re: fulltext search and hunspell
Previous Message Edoardo Panfili 2011-02-07 22:06:23 Re: SELECT INTO array[i] with PL/pgSQL