Re: Performance issues

From: Andreas Forø Tollefsen <andreasft(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Kenneth Marshall <ktm(at)rice(dot)edu>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance issues
Date: 2011-03-08 17:00:15
Message-ID: AANLkTi=DYWXkrs5zqCwMF6w1n0vbh=hK85iH8xtc+j0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Forgot to mention that the query terminates the connection because of a
crash of server process.

2011/3/8 Andreas Forø Tollefsen <andreasft(at)gmail(dot)com>

> Andy. Thanks. That is a great tips. I tried it but i get the error:
> NOTICE: ptarray_simplify returned a <2 pts array.
>
> Query:
> SELECT ST_Intersection(priogrid_land.cell,
> ST_Simplify(cshapeswdate.geom,0.1)) AS geom,
> priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
> capname, caplong, caplat, col, row, xcoord, ycoord
> FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
> ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear <=1946 AND
> cshapeswdate.gweyear >=1946 AND cshapeswdate.startdate <= '1946/1/1';
>
>
> 2011/3/8 Andy Colson <andy(at)squeakycode(dot)net>
>
> I have seen really complex geometries cause problems. If you have
>> thousands of points, when 10 would do, try ST_Simplify and see if it doesnt
>> speed things up.
>>
>> -Andy
>>
>>
>>
>> On 3/8/2011 2:42 AM, Andreas Forř Tollefsen wrote:
>>
>>> Hi. Thanks for the comments. My data is right, and the result is exactly
>>> what i want, but as you say i think what causes the query to be slow is
>>> the ST_Intersection which creates the intersection between the vector
>>> grid (fishnet) and the country polygons.
>>> I will check with the postgis user list if they have any idea on how to
>>> speed up this query.
>>>
>>> Best,
>>> Andreas
>>>
>>> 2011/3/8 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>
>>>
>>>
>>> =?ISO-8859-1?Q?Andreas_For=F8_Tollefsen?= <andreasft(at)gmail(dot)com
>>> <mailto:andreasft(at)gmail(dot)com>> writes:
>>> > This is a query i am working on now. It creates an intersection
>>> of two
>>> > geometries. One is a grid of 0.5 x 0.5 decimal degree sized
>>> cells, while the
>>> > other is the country geometries of all countries in the world for
>>> a certain
>>> > year.
>>>
>>> Hm, are you sure your data is right? Because the actual rowcounts
>>> imply
>>> that each country intersects about half of the grid cells, which
>>> doesn't
>>> seem right.
>>>
>>> > priogrid=# EXPLAIN ANALYZE SELECT priogrid_land.gid, gwcode,
>>> > ST_Intersection(pri
>>> > ogrid_land.cell, cshapeswdate.geom) FROM priogrid_land,
>>> cshapeswdate WHERE
>>> > ST_Intersects(priogrid_land.cell, cshapeswdate.geom);
>>> >
>>> QUERY
>>> > PLAN
>>>
>>> >
>>>
>>> --------------------------------------------------------------------------------
>>> > ------------------------------------------------------------------
>>> > Nested Loop (cost=0.00..12644.85 rows=43351 width=87704) (actual
>>> > time=1.815..7
>>> > 074973.711 rows=130331 loops=1)
>>> > Join Filter: _st_intersects(priogrid_land.cell,
>>> cshapeswdate.geom)
>>> > -> Seq Scan on cshapeswdate (cost=0.00..14.42 rows=242
>>> width=87248)
>>> > (actual
>>> > time=0.007..0.570 rows=242 loops=1)
>>> > -> Index Scan using idx_priogrid_land_cell on priogrid_land
>>> > (cost=0.00..7.1
>>> > 5 rows=1 width=456) (actual time=0.069..5.604 rows=978 loops=242)
>>> > Index Cond: (priogrid_land.cell && cshapeswdate.geom)
>>> > Total runtime: 7075188.549 ms
>>> > (6 rows)
>>>
>>> AFAICT, all of the runtime is going into calculating the ST_Intersects
>>> and/or ST_Intersection functions. The two scans are only accounting
>>> for
>>> perhaps 5.5 seconds, and the join infrastructure isn't going to be
>>> terribly expensive, so it's got to be those functions. Not knowing
>>> much
>>> about PostGIS, I don't know if the functions themselves can be
>>> expected
>>> to be really slow. If it's not them, it could be the cost of fetching
>>> their arguments --- in particular, I bet the country outlines are very
>>> large objects and are toasted out-of-line. There's been some past
>>> discussion of automatically avoiding repeated detoastings in scenarios
>>> like the above, but nothing's gotten to the point of acceptance yet.
>>> Possibly you could do something to force detoasting in a subquery.
>>>
>>> regards, tom lane
>>>
>>>
>>>
>>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mindaugas Riauba 2011-03-08 18:27:52 Re: Linux I/O schedulers - CFQ & random seeks
Previous Message Andreas Forø Tollefsen 2011-03-08 16:58:58 Re: Performance issues