Re: BUG #13908: Query returns too few rows

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Seth P <seth-p(at)outlook(dot)com>
Subject: Re: BUG #13908: Query returns too few rows
Date: 2016-02-04 22:13:00
Message-ID: CAKFQuwZcE4x=2ZvFpiuTagdwLACuTtTVk_NxZed3NUOKe7ziQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tom, we really need you to chime in here.

On Thu, Feb 4, 2016 at 2:58 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> Adding back the list so that everyone can see the latest finding. Adding
> in the original bug block too...
> ​
>
> The following bug has been logged on the website:
>
> Bug reference: 13908
> Logged by: Seth
> Email address: seth-p(at)outlook(dot)com
> PostgreSQL version: 9.5.0
> Operating system: Windows 10
> Description:
>
>
> * -> Unique (cost=1497322.60..1501340.65 rows=267870 width=24) (actual
> time=153375.583..154829.461 rows=415874 loops=1)*
>
>

​Without running any DML I just got this result on the DISTINCT query...​

*​ -> Unique (cost=1519634.64..1520973.99 rows=200 width=48) (actual
time=161695.425..163174.422 rows=416075 loops=1)*

There is nothing in the SQL​ itself that would invoke an order dependency...

The query, the explain analyze on the first pass and the explain analyze on
the second pass.

*The second-level Hash Join combines/sees, in both cases, 7,993 and
1,275,138 records but depending on the LEFT/RIGHT order of the sub-nodes
appears to provide a different result.*

EXPLAIN ANALYZE
select count(*) from
(
SELECT DISTINCT
rrr
FROM public.rrr
INNER JOIN public.fff
ON fff.idx = rrr.fff_idx
INNER JOIN public.uuu
ON uuu.universe_hash = 5188205190738336870 AND
uuu.bababa = rrr.bababa
WHERE
fff.file_name_model_ver = '100' AND
fff.file_name_date BETWEEN '2005-03-01' AND '2005-07-30'
) ad

QUERY PLAN
Aggregate (cost=1504689.02..1504689.03 rows=1 width=0) (actual
time=155253.758..155253.759 rows=1 loops=1)
-> Unique (cost=1497322.60..1501340.65 rows=267870 width=24) *(actual
time=153375.583..154829.461 rows=415874 loops=1)*
-> Sort (cost=1497322.60..1497992.27 rows=267870 width=24)
(actual time=153375.578..153919.245 rows=415874 loops=1)
Sort Key: rrr.idx, rrr.row_number, rrr.bababa, rrr.data_date,
rrr.fff_idx
Sort Method: external merge Disk: 13784kB
-> Hash Join (cost=1456589.92..1467677.54 rows=267870
width=24) (actual time=151982.289..152665.099 rows=415874 loops=1)
Hash Cond: ((uuu.bababa)::text = (rrr.bababa)::text)
-> Index Only Scan using pk_uuu on uuu
(cost=0.42..270.01 rows=8091 width=8) (actual time=0.012..9.359 rows=*7993*
loops=1)
Index Cond: (universe_hash =
'5188205190738336870'::bigint)
Heap Fetches: 0
-> Hash (cost=1440572.59..1440572.59 rows=872393
width=24) (actual time=151981.919..151981.919 rows=*1275138* loops=1)
Buckets: 65536 (originally 65536) Batches: 32
(originally 16) Memory Usage: 3585kB
-> Hash Join (cost=819.47..1440572.59
rows=872393 width=24) (actual time=12037.549..150408.756 rows=1275138
loops=1)
Hash Cond: (rrr.fff_idx = fff.idx)
-> Seq Scan on rrr (cost=0.00..1164409.32
rows=71098632 width=24) (actual time=0.004..73673.708 rows=71098547 loops=1)
-> Hash (cost=807.58..807.58 rows=951
width=4) (actual time=2.360..2.360 rows=964 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 42kB
-> Bitmap Heap Scan on fff
(cost=30.98..807.58 rows=951 width=4) (actual time=0.123..1.293 rows=964
loops=1)
Recheck Cond: ((file_name_date
>= '2005-03-01'::date) AND (file_name_date <= '2005-07-30'::date))
Filter:
((file_name_model_ver)::text = '100'::text)
Rows Removed by Filter: 540
Heap Blocks: exact=30
-> Bitmap Index Scan on
ix_fff_file_name_date (cost=0.00..30.74 rows=1445 width=0) (actual
time=0.111..0.111 rows=1504 loops=1)
Index Cond:
((file_name_date >= '2005-03-01'::date) AND (file_name_date <=
'2005-07-30'::date))
Planning time: 0.668 ms
Execution time: 155255.933 ms

QUERY PLAN
Aggregate (cost=1520976.49..1520976.50 rows=1 width=0) (actual
time=163603.573..163603.574 rows=1 loops=1)
-> Unique (cost=1519634.64..1520973.99 rows=200 width=48) *(actual
time=161695.425..163174.422 rows=416075 loops=1)*
-> Sort (cost=1519634.64..1520304.32 rows=267870 width=48)
(actual time=161695.393..162157.673 rows=416075 loops=1)
Sort Key: rrr.*
Sort Method: external sort Disk: 22376kB
-> Hash Join (cost=1190.62..1487242.09 rows=267870
width=48) (actual time=12556.229..158035.711 rows=416075 loops=1)
Hash Cond: ((rrr.bababa)::text = (uuu.bababa)::text)
-> Hash Join (cost=819.47..1440572.59 rows=872393
width=56) (actual time=12538.212..156077.489 rows=*1275138 *loops=1)
Hash Cond: (rrr.fff_idx = fff.idx)
-> Seq Scan on rrr (cost=0.00..1164409.32
rows=71098632 width=60) (actual time=0.008..82185.329 rows=71098547 loops=1)
-> Hash (cost=807.58..807.58 rows=951 width=4)
(actual time=2.351..2.351 rows=964 loops=1)
Buckets: 1024 Batches: 1 Memory Usage:
42kB
-> Bitmap Heap Scan on fff
(cost=30.98..807.58 rows=951 width=4) (actual time=0.097..1.307 rows=964
loops=1)
Recheck Cond: ((file_name_date >=
'2005-03-01'::date) AND (file_name_date <= '2005-07-30'::date))
Filter: ((file_name_model_ver)::text
= '100'::text)
Rows Removed by Filter: 540
Heap Blocks: exact=30
-> Bitmap Index Scan on
ix_fff_file_name_date (cost=0.00..30.74 rows=1445 width=0) (actual
time=0.086..0.086 rows=1504 loops=1)
Index Cond: ((file_name_date >=
'2005-03-01'::date) AND (file_name_date <= '2005-07-30'::date))
-> Hash (cost=270.01..270.01 rows=8091 width=8)
(actual time=17.233..17.233 rows=7993 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 377kB
-> Index Only Scan using pk_uuu on uuu
(cost=0.42..270.01 rows=8091 width=8) (actual time=0.012..8.525 rows=*7993
*loops=1)
Index Cond: (universe_hash =
'5188205190738336870'::bigint)
Heap Fetches: 0
Planning time: 0.642 ms
Execution time: 163606.571 ms

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message tarasbob 2016-02-04 22:13:23 BUG #13918: Simple query with Having clause returns incorrect results
Previous Message David G. Johnston 2016-02-04 21:58:18 Fwd: BUG #13908: Query returns too few rows