Re: Help on indexing timestamps

From: Andre Schubert <andre(at)km3(dot)de>
To: Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Help on indexing timestamps
Date: 2003-03-07 13:11:44
Message-ID: 20030307141144.76b65a14.andre@km3.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, 7 Mar 2003 15:03:01 -0200 (GMT+2)
Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> wrote:

> On Fri, 7 Mar 2003, Andre Schubert wrote:
>
> > On Fri, 7 Mar 2003 14:17:36 -0200 (GMT+2)
> > Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> wrote:
> >
> > > On Fri, 7 Mar 2003, Andre Schubert wrote:
> > >
> > > > On Fri, 7 Mar 2003 13:48:04 -0200 (GMT+2)
> > > > Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> wrote:
> > > >
> > > > > On Fri, 7 Mar 2003, Andre Schubert wrote:
> > > > >
> > > > > > On Fri, 7 Mar 2003 13:01:16 -0200 (GMT+2)
> > > > > > Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> wrote:
> > > > > >
> > > > > > > On Fri, 7 Mar 2003, Andre Schubert wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > i have a little problem on indexing a table which contains
> > > > > > > > about 4 millions of traffic-data.
> > > > > > > > My problem is, that a want to select all data from
> > > > > > > > a specific month from a specific ip and this select should use the index.
> > > > > > > > I use the following select:
> > > > > > >
> > > > > > > Did you try to use BETWEEN ??
> > > > > > > E.g.
> > > > > > > ... and time_stamp between '2003-01-01 00:00:00'::timestamp and
> > > > > > > '2003-02-01 00:00:00'::timestamp
> > > > > > >
> > > > > >
> > > > > > Yes and it works if i write the dates by hand, every new month.
> > > > > > But the query is executed automatically and i dont want
> > > > > > to write in the dates before the query is executed. Maybe the
> > > > > > the start and enddate should also be alculated with sql,
> > > > > > because i want to create a view from this statement and execute it every month.
> > > > > > Or did i miss something.
> > > > >
> > > > > You could have an index on the
> > > > > whole
> > > > > date_trunc('month',tbl_traffic.time_stamp),ip
> > > > >
> > > > > How does it perform?
> > > > >
> > > >
> > > > I'am not sure how to create such an index...
> > > >
> > > > First: create or replace function trunc_ip(timestamp with time zone) returns timestamptz as
> > > > 'select date_trunc(''month'',$1)' language 'sql' with (iscachable);
> > > > Then: create index idx_test on tbl_traffic using btree( trunc(time_stamp) );
> > > >
> > > > Result: db_km3aue=# explain analyze select sum(inet_up+inet_down) from tbl_traffic where trunc(tbl_traffic.time_stamp) = trunc('2003-02-01'::timestamptz) and ip = '80.243.40.56';
> > > > NOTICE: QUERY PLAN:
> > > >
> > > > Aggregate (cost=108.78..108.78 rows=1 width=16) (actual time=2278.48..2278.48 rows=1 loops=1)
> > > > -> Index Scan using idx_test on tbl_traffic (cost=0.00..108.78 rows=1 width=16) (actual time=0.23..2240.50 rows=5346 loops=1)
> > > > Total runtime: 2278.62 msec
> > > >
> > > > Maybe the problem is, that the index is created without ip as the second column....
> > >
> > > Sorry, it sliped my mind that we cannot have compound indexes on
> > > functions. :(
> > >
> > > Anyway.
> > > Did the explicit BETWEEN gave you satisfactory performance?
> > >
> >
> > Yes if i use BETWEEN and type in the dates by hand the query takes about 200ms with explain analyze.
> > I think these is a good performance.
> > But if i use now() instead of manually typed dates the query take about 1400ms :(
> > I thought the somebody posted to this list, that now() is a function that is not cached,
> > and thatswhy does not work pretty well with indexes.
> > I created a cached function cached_now() which returns now() but is declared with "isCacheable".
> > If i use cached_now() instead of now the query takes also about 200ms :)
> > Is it safe, or better could it have any side-effects, if i use my "own" cached_now() in such a query ?
>
> I think you should use a high level language to wrap your PostgreSQL
> access (e.g. java, c, perl, ....).
>
> Since you are gonna be running this query as a batch job,
> you must use a high level lang that can handle dates in
> a UNIX-like way. (milisecs from 1970-01-01 00:00:00)
>

Ok, i think you are right.
Thanks for the quick advice.

Regards, as

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Christoph Haller 2003-03-07 13:15:32 Re: problem with subselect
Previous Message Mathieu Arnold 2003-03-07 13:04:42 problem with subselect