Re: Processor usage/tuning question

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Israel Brewster <israel(at)ravnalaska(dot)net>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Processor usage/tuning question
Date: 2014-10-10 21:04:46
Message-ID: 543849EE.9060900@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 10/8/14, 3:17 PM, Israel Brewster wrote:
> Except that the last data point received is still valid as the aircraft's current location, even if it came in several hours ago - which we may well care about. That is, some users may need to see where a given aircraft (or the entire fleet) is, even if an aircraft hasn't updated in a while. That said, I did discuss this with my higher-ups, and got the ok to take it down to four hours.

Note that in your explain output nothing is filtering by time at all; are you sure you posted the right explain?

>> I don't think PostgreSQL is going to be able to reason very effectively about a ROW_NUMBER() in a inner table and then a row<=5 in the outer one being equivalent to a LIMIT query for which it could walk an index and then stopping once it finds 5 of them.
>>
>> Does this need to issued as a single query? Why not issue 55 different queries? It seems like the client is likely going to need to pick the returned list back out by tail number anyway, so both the client and the server might be happier with separate queries.
>
> Good point. Doing that brought the entire execution time down to around 60ms. Just ever so slightly better than the ~1200ms I was getting before. :-) I just have an unreasonable aversion to making multiple queries. I guess this is a prime example of why I need to overcome that :-)

Do you actually need the last 5 points? If you could get away with just the most recent point, SELECT DISTINCT ON might do a better job of this in a single query.

As for the concern about issuing multiple queries, if you code this into a database function it should still be quite fast because there won't be any round-trip between your application and the database.

Something else to consider is having a second table that only keeps the last X aircraft positions. I would do this by duplicating every insert into that table via a trigger, and then have a separate process that ran once a minute to delete any records other than the newest X. Because that table would always be tiny queries against it should be blazing fast. Do note that you'll want to vacuum that table frequently, like right after each delete.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephen Davies 2014-10-11 00:19:04 Re: psql connection issue
Previous Message Jim Nasby 2014-10-10 20:44:18 Re: table versioning approach (not auditing)