Optimize a big matrix select

From: Florian Schaetz <Florian(dot)Schaetz(at)Optitool(dot)DE>
To: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Optimize a big matrix select
Date: 2015-03-27 10:06:23
Message-ID: 01E7E8E50C98824E9045DFD5FFC859D403405B5F8E00@SBSRV.ot.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hello together,

my name is Florian (ok, that much was obvious) and I'm using PostgreSQL as well as MySQL for a caching solution on a routing server. One thing I am always trying to do is optimize the db times, so here I go again...

I have a table myTable with many rows (millions, potentially billions). This table looks approx. like this...

- attribute_id
- source
- destination
- payload1
- payload2
- payload3
All of these columns are integers. The primary key is (attribute_id, source, destination), but there are also single indexes on the first three columns.

What I want to do is to get the "matrix" for n source/destination points, for example if I had the points 1,2, then I would want the payload for...

source = 1, destination = 1
source = 1, destination = 2
source = 2, destination = 1
source = 2, destination = 2

So, my query, using attribute_1 = 1 as an example, looks like this:

select c.source, c.destination, c.payload1, c.payload2, c.payload3 from myTable
inner join TEMP_TABLE t1 ON t1.point = c.source
inner join TEMP_TABLE t2 ON t2.point = c.destination
where c.attribute_id = 1;

The TEMP_TABLE simply contains the list of integers as "point" (with an index).

With the temp table is filled with around 1700 numers, the result will be approx. 2.9mil entries (1700 * 1700). Unfortunately, it's quite slow (10-15secs). With MySQL, the main problem is the fetch, while the query itself only takes milliseconds, but I don't know where the problem with PostgreSQL is (if any). But with only selecting count(*) the whole thing is much faster (1.3 seconds), so I guess the fetch part is the problem here, too.

I have already tried more indices, sub-queries, IN, = ANY, etc. - nothing seems to speed it up.

Does anyone see a chance to speed this query up significantly?

Regards,

Flo

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Kevin Grittner 2015-03-27 14:47:20 Re: Optimize a big matrix select
Previous Message Ravi Kiran 2015-03-27 09:44:17 Implementing a join algorithm in Postgres