Best way to find last data sample before a given time

From: George Woodring <george(dot)woodring(at)iglass(dot)net>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Best way to find last data sample before a given time
Date: 2016-10-21 15:21:44
Message-ID: CACi+J=TGgxZoqHnhJQU-iB+8tZKY81FDBakfmyMsJN9D9rw=Vw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

We have a plpsql function that checks a threshold for a given data point
that seems to work well. The goal of the function is to find the last
sample before or equal to a given time.

The function runs the following sql command

SELECT outval >= 1875000000 FROM rrd.d_current
WHERE timeslot <= TO_TIMESTAMP(#)
AND pollgrpid = # AND dsnum = #
ORDER BY timeslot DESC limit 1;

It runs against an inherited table structure with the parent empty and 14
child tables ( one for each day ).

db=> \d d_current_20161020
Table "rrd.d_current_20161020"
Column | Type | Modifiers
-----------+--------------------------+-----------
timeslot | timestamp with time zone | not null
pollgrpid | integer | not null
dsnum | integer | not null
lasttime | timestamp with time zone |
lastval | double precision |
incount | integer | default 1
outval | double precision |
error | text |
Indexes:
"d_current_20161020_pkey_index" UNIQUE, btree (timeslot, pollgrpid,
dsnum)
"d_current_20161020_pollgrpid_index" btree (pollgrpid, dsnum)
Check constraints:
"d_current_20161020_timeslot_check" CHECK (timeslot >= '2016-10-20
00:00:00+00'::timestamp with time zone AND timeslot < '2016-10-21
00:00:00+00'::timestamp with time zone)
Inherits: d_current

Normally the query works like this:

db=> explain analyze SELECT outval >= 1875000000 FROM rrd.d_current WHERE
timeslot <= TO_TIMESTAMP(1476999600) AND pollgrpid = 497582 AND dsnum = 0
ORDER BY timeslot DESC limit 1;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------
Limit (cost=8.29..1709.85 rows=1 width=16) (actual
time=2604.674..2604.675 rows=1 loops=1)
-> Result (cost=8.29..7098909.56 rows=4172 width=16) (actual
time=2604.671..2604.671 rows=1 loops=1)
-> Merge Append (cost=8.29..7098899.13 rows=4172 width=16)
(actual time=2604.658..2604.658 rows=1 loops=1)
Sort Key: d_current.timeslot
-> Index Scan Backward using d_current_pkey_index on
d_current (cost=0.12..8.15 rows=1 width=16) (actual time=0.007..0.007 rows
=0 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161007_pkey_index
on d_current_20161007 (cost=0.56..507816.69 rows=298 width=16) (actu
al time=249.842..249.842 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161008_pkey_index
on d_current_20161008 (cost=0.56..507610.93 rows=297 width=16) (actu
al time=5.688..5.688 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161009_pkey_index
on d_current_20161009 (cost=0.56..507996.35 rows=304 width=16) (actu
al time=4.857..4.857 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161010_pkey_index
on d_current_20161010 (cost=0.56..508675.73 rows=299 width=16) (actu
al time=1.546..1.546 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161011_pkey_index
on d_current_20161011 (cost=0.56..510520.42 rows=301 width=16) (actu
al time=248.069..248.069 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161012_pkey_index
on d_current_20161012 (cost=0.56..510777.81 rows=297 width=16) (actu
al time=218.087..218.087 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161013_pkey_index
on d_current_20161013 (cost=0.56..511092.71 rows=302 width=16) (actu
al time=198.877..198.877 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161014_pkey_index
on d_current_20161014 (cost=0.56..511168.98 rows=299 width=16) (actu
al time=233.877..233.877 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161015_pkey_index
on d_current_20161015 (cost=0.56..511254.41 rows=302 width=16) (actu
al time=207.323..207.323 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161016_pkey_index
on d_current_20161016 (cost=0.56..511334.88 rows=299 width=16) (actu
al time=231.319..231.319 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161017_pkey_index
on d_current_20161017 (cost=0.56..511533.88 rows=304 width=16) (actu
al time=524.123..524.123 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161018_pkey_index
on d_current_20161018 (cost=0.56..511763.48 rows=299 width=16) (actu
al time=232.597..232.597 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161019_pkey_index
on d_current_20161019 (cost=0.56..511706.24 rows=303 width=16) (actu
al time=247.332..247.332 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
-> Index Scan Backward using d_current_20161020_pkey_index
on d_current_20161020 (cost=0.56..465464.76 rows=267 width=16) (actu
al time=1.095..1.095 rows=1 loops=1)
Index Cond: ((timeslot <= '2016-10-20
21:40:00+00'::timestamp with time zone) AND (pollgrpid = 497582) AND (dsnum
= 0))
Total runtime: 2604.818 ms
(35 rows)

My problem is that if I run this against a newly created pollgrpid/dsnum
pair, there is no data in the earlier tables and my guess is that the query
switches to an seqence scan because I cannot get the query to finish ( my
last attempt did not complete after waiting 10 minutes ).

Any suggestions would be appreciated

George Woodring
iGLASS Networks
www.iglass.net

Browse pgsql-general by date

  From Date Subject
Next Message Cachique 2016-10-21 15:30:37 Re: Doubts about replication from many servers
Previous Message Edilmar LISTAS 2016-10-21 15:09:02 Doubts about replication from many servers