bad result from too large number of rows in imbricated requests

From: Éric Quinton <eric(dot)quinton(at)irstea(dot)fr>
To: pgsql-bugs(at)postgresql(dot)org
Subject: bad result from too large number of rows in imbricated requests
Date: 2014-10-29 08:47:37
Message-ID: 5450A9A9.4000908@irstea.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello,

We have data from oxygen level in an estuary. Each 10', a value is
recorded.

Our table :
CREATE TABLE mesure
(
mesure_id serial NOT NULL,
station_id integer NOT NULL,
mesure_date timestamp NOT NULL,
temperature float4,
turbidite float4,
oxygene float4,
saturation_o2 float4,
salinite float4,
hauteur_eau float4,
conductivite float4,
oxygene_meth2 float4,
saturation_o2_meth2 float4
);

We want to calculate the duration of an event when level of oxygen is
under a value, like < 4, for a station.

I've write this code :

with minmax as (
select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
select min(m2.mesure_date) from mesure m2
where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and
m2.station_id = m.station_id)
) as date_last

from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;

This request work on 11928 records (select count(*) from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0;)

But...
some records are missing :

date_debut date_last nb_minute
2006-07-12 22:07:00 2006-07-13 07:20:00 553
2006-07-13 07:32:00 2006-07-13 07:50:00 18
2006-07-13 08:54:00 2006-07-13 09:10:00 16
2006-07-13 09:24:00 2006-07-13 09:30:00 6
2006-07-13 10:04:00 2006-07-13 20:01:00 597
*2006-07-13 20:57:00 2006-07-13 21:42:00 45**
**2006-07-21 07:33:00 2006-07-24 01:52:00 3979*
2006-07-24 04:52:00 2006-07-24 06:44:00 112
2006-07-24 16:00:00 2006-07-24 18:07:00 127

Yet, with this request :

select
m.mesure_date as date_debut,
(select max (m1.mesure_date) from mesure m1
where m1.mesure_date >= m.mesure_date and m1.mesure_date < (
select min(m2.mesure_date) from mesure m2
where m2.mesure_date > m.mesure_date and m2.oxygene >= 4 and
m2.station_id = m.station_id)
) as date_last

from mesure m
where m.station_id = 2
and m.oxygene < 4 and m.oxygene > 0
*and m.mesure_date > '2006-01-01'*
order by 1
)
select distinct on (date_last) date_debut, date_last,
(extract (minute from (date_last - date_debut))
+ extract(hour from (date_last - date_debut)) * 60
+ extract(day from (date_last - date_debut)) * 1440)
as nb_minute
from minmax
;

date_debut date_last nb_minute
2006-07-12 08:14:00 2006-07-12 08:23:00 9
2006-07-12 10:04:00 2006-07-12 18:39:00 515
2006-07-12 22:07:00 2006-07-13 07:20:00 553
2006-07-13 07:32:00 2006-07-13 07:50:00 18
2006-07-13 08:54:00 2006-07-13 09:10:00 16
2006-07-13 09:24:00 2006-07-13 09:30:00 6
2006-07-13 10:04:00 2006-07-13 20:01:00 597
2006-07-13 20:57:00 2006-07-13 21:42:00 45
*2006-07-13 21:57:00 2006-07-24 01:52:00 14635*
2006-07-24 02:12:00 2006-07-24 06:44:00 272
2006-07-24 08:48:00 2006-07-24 13:56:00 308
2006-07-24 14:20:00 2006-07-24 14:36:00 16
2006-07-24 15:20:00 2006-07-24 18:07:00 167

the data are good.

The number of records is, in this last request, 9645.

In complement, for this request :
select mesure_id, mesure_date, station_id, oxygene
from mesure
where station_id = 2
and mesure_date >= '2006-07-12' and mesure_date <= '2006-07-24'
order by mesure_date;

2 files : raw_data.txt, and insert_query.txt.

Requests generate with SQL Workbench/J Build 115 (2013-09-01 11:26)
Server :
cat /etc/issue : Debian GNU/Linux 7
uname -r : 3.2.0-4-amd64
psql --version : psql (PostgreSQL) 9.3.5

Tanks,

--
Éric Quinton
Ingénieur en bases de données
Unité de recherche "Écosystèmes aquatiques et changements globaux" (EABX)
IRSTEA
50, avenue de Verdun-Gazinet
F-33612 Cestas Cedex
+33 (0)5 57 89 27 10
www.irstea.fr <http://www.irstea.fr/>

Pour mieux affirmer ses missions,
le Cemagref devient Irstea

Attachment Content-Type Size
insert_query.txt text/plain 250.2 KB
raw_data.txt text/plain 58.2 KB

Browse pgsql-bugs by date

  From Date Subject
Next Message chenhj 2014-10-29 09:09:01 psql treat backslash which is not the first character in the input line as meta-command
Previous Message Romu Hu 2014-10-29 06:40:14 Re: [BUGS] Need guidance on regression.diffs