From: | Florian Schaetz <Florian(dot)Schaetz(at)Optitool(dot)DE> |
---|---|
To: | "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org> |
Subject: | Optimize a big matrix select |
Date: | 2015-03-27 10:06:23 |
Message-ID: | 01E7E8E50C98824E9045DFD5FFC859D403405B5F8E00@SBSRV.ot.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
Hello together,
my name is Florian (ok, that much was obvious) and I'm using PostgreSQL as well as MySQL for a caching solution on a routing server. One thing I am always trying to do is optimize the db times, so here I go again...
I have a table myTable with many rows (millions, potentially billions). This table looks approx. like this...
- attribute_id
- source
- destination
- payload1
- payload2
- payload3
All of these columns are integers. The primary key is (attribute_id, source, destination), but there are also single indexes on the first three columns.
What I want to do is to get the "matrix" for n source/destination points, for example if I had the points 1,2, then I would want the payload for...
source = 1, destination = 1
source = 1, destination = 2
source = 2, destination = 1
source = 2, destination = 2
So, my query, using attribute_1 = 1 as an example, looks like this:
select c.source, c.destination, c.payload1, c.payload2, c.payload3 from myTable
inner join TEMP_TABLE t1 ON t1.point = c.source
inner join TEMP_TABLE t2 ON t2.point = c.destination
where c.attribute_id = 1;
The TEMP_TABLE simply contains the list of integers as "point" (with an index).
With the temp table is filled with around 1700 numers, the result will be approx. 2.9mil entries (1700 * 1700). Unfortunately, it's quite slow (10-15secs). With MySQL, the main problem is the fetch, while the query itself only takes milliseconds, but I don't know where the problem with PostgreSQL is (if any). But with only selecting count(*) the whole thing is much faster (1.3 seconds), so I guess the fetch part is the problem here, too.
I have already tried more indices, sub-queries, IN, = ANY, etc. - nothing seems to speed it up.
Does anyone see a chance to speed this query up significantly?
Regards,
Flo
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2015-03-27 14:47:20 | Re: Optimize a big matrix select |
Previous Message | Ravi Kiran | 2015-03-27 09:44:17 | Implementing a join algorithm in Postgres |