Re: How to find first non-vacation day

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: How to find first non-vacation day
Date: 2006-02-03 23:40:42
Message-ID: 607j8cj9j9.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Andrus Moor" <eetasoft(at)online(dot)ee> writes:

> I have a table of vacations
>
> create table vacation (
> id integer primary key,
> dstart date,
> dend date );
>
>
> I need to find first non-vacation day before given date.
>
> This can be done using the following procedural vfp code
>
> function nonvacation( dbefore )
>
> for i=dbefore to date(1960,1,1) step -1
> select vacation
> locate for between( i, dstart, dend )
> if not found()
> return i
> endif
> endfor
> return null
>
> but this is very slow
>
> How to implement this as sql select statement ?

People try to get baroquely clever about building tiny tables to
represent these things; it tends not to work out well, because the
queries get even more baroque...

I'd create a table of all the days of the year:

create table days (
a_day date,
primary key(a_day)
);

Fill it in with 365 values:

insert into days
select '2005-12-31'::date + (generate_series||'days')::interval from generate_series(1,365);

Suppose vacations are thus...

/* cbbrowne(at)[local]/dba2 ~=*/ select * from vacation;
dstart | dend
------------+------------
2006-01-01 | 2006-01-01
2006-03-15 | 2006-03-19
2006-12-24 | 2006-12-25
(3 rows)

Forget about your representation of vacation; replace it with the
following "set of vacation days":

create table vacation_days as select distinct a_day from vacation, days where a_day between dstart and dend;

Now, to find the *last working day* before, oh, say, 2006-03-18...

/* cbbrowne(at)[local]/dba2 ~=*/ select max(a_day) from (select a_day from days where a_day not in (select a_day from vacation_days)) as non_vac_days where a_day < '2006-03-18';
max
------------
2006-03-14
(1 row)

Determining cost...

/* cbbrowne(at)[local]/dba2 ~=*/ explain analyze select max(a_day) from (select a_day from days where a_day not in (select a_day from vacation_days)) as non_vac_days where a_day < '2006-03-18';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=5.43..5.44 rows=1 width=4) (actual time=0.644..0.647 rows=1 loops=1)
-> Index Scan using days_pkey on days (cost=1.10..5.33 rows=38 width=4) (actual time=0.112..0.406 rows=72 loops=1)
Index Cond: (a_day < '2006-03-18'::date)
Filter: (NOT (hashed subplan))
SubPlan
-> Seq Scan on vacation_days (cost=0.00..1.08 rows=8 width=4) (actual time=0.006..0.033 rows=8 loops=1)
Total runtime: 0.729 ms
(7 rows)

If you're doing a lot of calculations of "work days," then it would
make a lot of sense to create a "materialized calendar" representing
the work days of the year...

--- Start with all days
create table work_calendar as select a_day from days;
create unique index wc_day on work_calendar (a_day);
--- Drop out Saturday/Sunday
delete from work_calendar where date_part('dow', a_day) not in (0,6);
--- Drop out vacation days
delete from work_calendar where a_day in (select a_day from days, vacation where a_day between dstart and dend);

/* cbbrowne(at)[local]/dba2 ~=*/ select max(a_day) from work_calendar where a_day < '2006-03-18';
max
------------
2006-03-14
(1 row)

/* cbbrowne(at)[local]/dba2 ~=*/ explain analyze select max(a_day) from work_calendar where a_day < '2006-03-18';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.07..0.08 rows=1 width=0) (actual time=0.043..0.047 rows=1 loops=1)
InitPlan
-> Limit (cost=0.00..0.07 rows=1 width=4) (actual time=0.027..0.030 rows=1 loops=1)
-> Index Scan Backward using wc_day on work_calendar (cost=0.00..3.73 rows=54 width=4) (actual time=0.019..0.019 rows=1 loops=1)
Index Cond: (a_day < '2006-03-18'::date)
Filter: (a_day IS NOT NULL)
Total runtime: 0.101 ms
(7 rows)

The overall point: If you create the calendars as sets of days, then
SQL provides you with *excellent* ways of manipulating them as sets
where you say things like "where day is in this set" and "where day is
*not* in that set" and such.
--
"cbbrowne","@","acm.org"
http://cbbrowne.com/info/nonrdbms.html
"If the programmer can simulate a construct faster then the compiler
can implement the construct itself, then the compiler writer has blown
it badly." -- Guy L. Steele, Jr., Tartan Laboratories

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Chris Browne 2006-02-03 23:44:59 Re: How to find first non-vacation day
Previous Message Philippe Ferreira 2006-02-03 23:27:44 Bug with sequences and WAL ?