One query run twice in parallel results in huge performance decrease

From: Jan Michel <Jan(at)mueschelsoft(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: One query run twice in parallel results in huge performance decrease
Date: 2013-11-29 22:07:59
Message-ID: 5299103F.1040307@mueschelsoft.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Dear all,
I have a quite strange problem running an extensive query on geo data
checking for crossing ways. I don't know if this is a postgres or postgis
problem, but I hope you can help. Running one thread is no problem, it
finishes within 10-15 minutes. Run two of those queries in parallel and
they will not finish within 24 hours. It is definitely not a caching or I/O
problem.

First, the environment:
Running on a large server (32 cores, 128 GB RAM, fast RAID disks)
I tested psql 8.1 / 9.1 / 9.3 and postgis 1.5 and 2.1.0 on Debian 6 and
OpenSuse 12.3. All behave similar. The pgsql server settings were
optimized using pgtune, wal logging and autovacuum is off.

I'm working on a set of databases, each 5-10 GB big filled with OSM
geo data. I run many different queries, and I know the server can handle
up to 8 parallel tasks without a decrease in performance compared to a
single thread. Most data is kept in the cache and almost no read access
to the disk needs to be done.
Everything works well, despite one query, that runs on a table with ~ 1M
entries. It searches for ways crossing each other:
http://etherpad.netluchs.de/pgquery
(The definition of the source table is included as well)

Here is the explain analyze of the query:
http://explain.depesz.com/s/fAcV
As you can see, the row estimate is far off, but the runtime of 11 minutes
is acceptable, I think.

When I run a second instance of this query in a unrelated database on the
same server, they take 100% CPU, no iowait and they do not finish even
after more than a day.
An explain done directly before executing the query shows a huge cost
estimate and varying different plans:
http://explain.depesz.com/s/XDR
http://explain.depesz.com/s/SeG

How can two queries have such a strong influence on each other? Especially
when the host server could handle even ten queries without problems?
And most important: What can I do?

Thank you all in advance for your help!
Jan

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2013-11-29 22:42:04 Re: One query run twice in parallel results in huge performance decrease
Previous Message Michael Paquier 2013-11-29 01:41:58 Re: Postgresql in a Virtual Machine