BUG #16627: union all with partioned table yields random aggregate results

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: kanaga(at)consumeracquisition(dot)com
Subject: BUG #16627: union all with partioned table yields random aggregate results
Date: 2020-09-22 15:10:51
Message-ID: 16627-1a2233d31af8d35b@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16627
Logged by: Brian Kanaga
Email address: kanaga(at)consumeracquisition(dot)com
PostgreSQL version: 11.4
Operating system: Linux
Description:

Hoping this is something already discovered as the amount of data to
recreate is cumbersome.

Two tables identical in structure, one is a partioned table. ex, "active
ledger" and "historical ledger" where the partitions of the historical one
are by year.

When parallel processing is disabled, the issue goes away. The issue also
prefers to (always?) occur when nestloop is disabled. So my workaround is
to disable parallel (max_parallel_workers_per_gather=0) when this type of
query is run.

From my experimentation it seems to be related to having multiple workers
gathering data and to some extent their gathering overlaps. again, just in
a partitioned table. if i change the subquery to union all with the
partition itself (ex fb_ad_activity_daily_archive_2019) then the problem
does not occur.

As an aside, I have had to disable nestloop to get around what seems to be
planner inaccuracy and have not found adjusting statistics targets helpful.
However with nestloop disabled (and possibly without), the union all with
partitioned table causes random data.

Example of query structure and results:

select

obj.none, obj.mobileos ,

sum(spend) as spend,

sum(conversions) as conversions

from (

select fa.id as adset , case when fa.targeting_ios is not null and
fa.targeting_android is not null then 'iOS and Android' when
fa.targeting_ios is not null and fa.targeting_android is null then 'iOS'
when fa.targeting_ios is null and fa.targeting_android is not null then
'Android' when fa.targeting_ios is not null and fa.targeting_android is not
null then 'Any' end as mobileos , 1 as none

from fb_campaigns fc
left join fb_adsets fa on fc.id=fa.campaign and fa.account in
(128091690677840,205971526867688,183647115790295)

where
-- see if using account in join is ok
(fc.account is null or fc.account in
(128091690677840,205971526867688,183647115790295)) and fa.account in
(128091690677840,205971526867688,183647115790295)

-- master_rel_wheres_end

group by fa.id , case when fa.targeting_ios is not null and
fa.targeting_android is not null then 'iOS and Android' when
fa.targeting_ios is not null and fa.targeting_android is null then 'iOS'
when fa.targeting_ios is null and fa.targeting_android is not null then
'Android' when fa.targeting_ios is not null and fa.targeting_android is not
null then 'Any' end , 1

) obj full outer join (

select a.campaign_id as adset ,

-- these columns will be missing in some tbls
avg(cast(relevance_score as float4)) as relevance_score,
avg(cast(positive_feedback as float4)) as positive_feedback,
avg(cast(negative_feedback as float4)) as negative_feedback,

sum(spend) as spend,
sum(case when 1=0 then 0 else
coalesce(unique_actions_1d_view_app_custom_event_fb_mobile_purchase,0)
end+case when 28=0 then 0 else
coalesce(unique_actions_28d_click_app_custom_event_fb_mobile_purchase,0)
end) as mobile_purchases
,
sum(case when 1=0 then 0 else
coalesce(unique_actions_1d_view_mobile_app_install,0) end+case when 28=0
then 0 else coalesce(unique_actions_28d_click_mobile_app_install,0) end) as
mobile_app_install
,
sum(case when 1=0 then 0 else
coalesce(action_values_1d_view_app_custom_event_fb_mobile_purchase,0)
end+case when 28=0 then 0 else
coalesce(action_values_28d_click_app_custom_event_fb_mobile_purchase,0) end)
as action_value_app_custom_event_fb_mobile_purchase
,
sum(case when 1=0 then 0 else
coalesce(unique_actions_1d_view_link_click,0) end+case when 28=0 then 0 else
coalesce(unique_actions_28d_click_link_click,0) end) as link_clicks
,sum(event_1) as event_1,sum(event_2) as event_2,sum(event_3) as
event_3,sum(event_4) as event_4,sum(impressions) as impressions,
sum(case when 1=0 then 0 else
coalesce(actions_1d_view_app_custom_event_fb_mobile_purchase,0) end+case
when 28=0 then 0 else
coalesce(actions_28d_click_app_custom_event_fb_mobile_purchase,0) end) as
nu_mobile_purchases
,sum(event_6) as event_6,sum(event_8) as event_8,
sum(case when 1=0 then 0 else
coalesce(actions_1d_view_mobile_app_install,0) end+case when 28=0 then 0
else coalesce(actions_28d_click_mobile_app_install,0) end) as
nu_mobile_app_install
,
sum(case when c.buying_type = 10 then actions_1d_view_mobile_app_install
end) as conversions --< - this
,
sum(case when 1=0 then 0 else coalesce(actions_1d_view_link_click,0)
end+case when 28=0 then 0 else coalesce(actions_28d_click_link_click,0) end)
as nu_link_clicks

from
(
select * from fb_ad_activity_daily where logdate between '11/01/2019'
and '11/17/2019' and account_id in
(128091690677840,205971526867688,183647115790295)
union all
select * from fb_ad_activity_daily_archive where logdate between
'11/01/2019' and '11/17/2019' and account_id in
(128091690677840,205971526867688,183647115790295) --< plus this
)
a
inner join fb_campaigns c on a.campaign_group_id=c.id


left join ca_ud_conversions ud on a.logdate=ud.logdate and
a.account_id=ud.account and a.adgroup_id=ud.adgroup

where a.logdate between '11/01/2019' and '11/17/2019' and a.account_id in
(128091690677840,205971526867688,183647115790295)

group by a.campaign_id

) stats on obj.adset = stats.adset

group by obj.none, obj.mobileos

The outputs of the sums will be random; results will vary. sometimes they
will be correct.

correct results:
1 Android 66050.68
1 iOS 27624.41
1

examples of incorrect random results
1 Android 111098.92
1 iOS 34314.69
1
1 Android 116473.33
1 iOS 37640.63
1
1 Android 68350.25
1 iOS 27624.41
1

-- issue does occur
set enable_nestloop TO 0;
GroupAggregate (cost=356113.02..356115.54 rows=1 width=76) (actual
time=1193.334..1194.411 rows=3 loops=1)
Group Key: (1), (CASE WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'iOS and Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NULL)) THEN
'iOS'::text WHEN ((fa.targeting_ios IS NULL) AND (fa.targeting_android IS
NOT NULL)) THEN 'Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'Any'::text ELSE NULL::text END)
-> Sort (cost=356113.02..356113.52 rows=200 width=52) (actual
time=1192.245..1193.084 rows=4280 loops=1)
Sort Key: (1), (CASE WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'iOS and Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NULL)) THEN
'iOS'::text WHEN ((fa.targeting_ios IS NULL) AND (fa.targeting_android IS
NOT NULL)) THEN 'Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'Any'::text ELSE NULL::text END)
Sort Method: quicksort Memory: 395kB
-> Merge Full Join (cost=356063.98..356105.38 rows=200 width=52)
(actual time=1183.640..1190.639 rows=4280 loops=1)
Merge Cond: (fa.id = "*SELECT* 1".campaign_id)
-> Group (cost=179905.47..179905.84 rows=1 width=44) (actual
time=85.440..90.013 rows=4280 loops=1)
Group Key: fa.id, (CASE WHEN ((fa.targeting_ios IS NOT
NULL) AND (fa.targeting_android IS NOT NULL)) THEN 'iOS and Android'::text
WHEN ((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NULL))
THEN 'iOS'::text WHEN ((fa.targeting_ios IS NULL) AND (fa.targeting_android
IS NOT NULL)) THEN 'Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'Any'::text ELSE NULL::text END)
-> Gather Merge (cost=179905.47..179905.83 rows=3
width=40) (actual time=85.438..87.984 rows=4280 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Group (cost=178905.43..178905.44 rows=1
width=40) (actual time=79.332..80.143 rows=1070 loops=4)
Group Key: fa.id, (CASE WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NOT NULL)) THEN
'iOS and Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NULL)) THEN 'iOS'::text WHEN ((fa.targeting_ios IS
NULL) AND (fa.targeting_android IS NOT NULL)) THEN 'Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NOT NULL)) THEN
'Any'::text ELSE NULL::text END)
-> Sort (cost=178905.43..178905.43 rows=1
width=40) (actual time=79.329..79.567 rows=1070 loops=4)
Sort Key: fa.id, (CASE WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NOT NULL)) THEN
'iOS and Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NULL)) THEN 'iOS'::text WHEN ((fa.targeting_ios IS
NULL) AND (fa.targeting_android IS NOT NULL)) THEN 'Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NOT NULL)) THEN
'Any'::text ELSE NULL::text END)
Sort Method: quicksort Memory: 97kB
Worker 0: Sort Method: quicksort
Memory: 72kB
Worker 1: Sort Method: quicksort
Memory: 99kB
Worker 2: Sort Method: quicksort
Memory: 103kB
-> Parallel Hash Join
(cost=167790.90..178905.42 rows=1 width=40) (actual time=70.443..78.798
rows=1070 loops=4)
Hash Cond: (fc.id =
fa.campaign)
-> Parallel Seq Scan on
fb_campaigns fc (cost=0.00..11113.68 rows=74 width=8) (actual
time=0.090..7.951 rows=174 loops=4)
Filter: ((account IS NULL)
OR (account = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
Rows Removed by Filter:
49408
-> Parallel Hash
(cost=167790.81..167790.81 rows=7 width=64) (actual time=70.253..70.253
rows=1070 loops=4)
Buckets: 4096 (originally
1024) Batches: 1 (originally 1) Memory Usage: 440kB
-> Parallel Seq Scan on
fb_adsets fa (cost=0.00..167790.81 rows=7 width=64) (actual
time=0.988..55.405 rows=1070 loops=4)
Filter: ((account =
ANY ('{128091690677840,205971526867688,183647115790295}'::bigint[])) AND
(account = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
Rows Removed by
Filter: 198077
-> Finalize GroupAggregate (cost=176158.52..176197.01
rows=200 width=160) (actual time=1098.196..1098.694 rows=64 loops=1)
Group Key: "*SELECT* 1".campaign_id
-> Gather Merge (cost=176158.52..176192.97 rows=272
width=24) (actual time=1098.160..1141.270 rows=181 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial GroupAggregate
(cost=175158.49..175161.55 rows=136 width=24) (actual
time=1088.396..1088.791 rows=60 loops=3)
Group Key: "*SELECT* 1".campaign_id
-> Sort (cost=175158.49..175158.83
rows=136 width=22) (actual time=1088.377..1088.547 rows=675 loops=3)
Sort Key: "*SELECT* 1".campaign_id
Sort Method: quicksort Memory: 73kB
Worker 0: Sort Method: quicksort
Memory: 80kB
Worker 1: Sort Method: quicksort
Memory: 79kB
-> Parallel Hash Left Join
(cost=90647.99..175153.67 rows=136 width=22) (actual time=963.744..1088.079
rows=675 loops=3)
Hash Cond: (("*SELECT*
1".logdate = ud.logdate) AND ("*SELECT* 1".account_id = ud.account) AND
("*SELECT* 1".adgroup_id = ud.adgroup))
-> Parallel Hash Join
(cost=11673.89..91194.54 rows=136 width=42) (actual time=64.276..76.694
rows=675 loops=3)
Hash Cond: ("*SELECT*
1".campaign_group_id = c.id)
-> Parallel Append
(cost=0.42..79520.71 rows=140 width=48) (actual time=6.548..18.083 rows=675
loops=3)
-> Subquery Scan on
"*SELECT* 1" (cost=0.42..2.67 rows=1 width=48) (actual time=0.016..0.016
rows=0 loops=1)
-> Index Scan
using fb_ad_activity_daily2_pkey on fb_ad_activity_daily (cost=0.42..2.66
rows=1 width=1618) (actual time=0.015..0.015 rows=0 loops=1)
Index
Cond: ((logdate >= '2019-11-01'::date) AND (logdate <= '2019-11-17'::date)
AND (logdate >= '2019-11-01'::date) AND (logdate <= '2019-11-17'::date))
Filter:
((account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])) AND
(account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
-> Subquery Scan on
"*SELECT* 2" (cost=0.56..79516.88 rows=326 width=48) (actual
time=6.541..17.792 rows=675 loops=3)
-> Parallel
Append (cost=0.56..79513.62 rows=192 width=1618) (actual time=6.539..17.471
rows=675 loops=3)
->
Parallel Index Scan using fb_ad_activity_daily_archive_2019_pkey on
fb_ad_activity_daily_archive_2019 (cost=0.56..79512.66 rows=192 width=1618)
(actual time=6.538..17.169 rows=675 loops=3)

Index Cond: ((logdate >= '2019-11-01'::date) AND (logdate <=
'2019-11-17'::date) AND (logdate >= '2019-11-01'::date) AND (logdate <=
'2019-11-17'::date) AND (account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])) AND
(account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
-> Parallel Hash
(cost=10873.76..10873.76 rows=63976 width=10) (actual time=57.291..57.292
rows=66109 loops=3)
Buckets: 262144
Batches: 1 Memory Usage: 11424kB
-> Parallel Seq
Scan on fb_campaigns c (cost=0.00..10873.76 rows=63976 width=10) (actual
time=0.008..28.978 rows=66109 loops=3)
-> Parallel Hash
(cost=60953.22..60953.22 rows=771422 width=20) (actual time=875.479..875.479
rows=1031137 loops=3)
Buckets: 131072 Batches:
32 Memory Usage: 6432kB
-> Parallel Seq Scan on
ca_ud_conversions ud (cost=0.00..60953.22 rows=771422 width=20) (actual
time=0.004..405.186 rows=1031137 loops=3)
Planning Time: 3.630 ms
Execution Time: 1238.495 ms

-- issue does not occur
SET max_parallel_workers_per_gather TO 0;
set enable_nestloop TO 0;
GroupAggregate (cost=466729.81..466732.32 rows=1 width=76) (actual
time=2866.611..2867.687 rows=3 loops=1)
Group Key: (1), (CASE WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'iOS and Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NULL)) THEN
'iOS'::text WHEN ((fa.targeting_ios IS NULL) AND (fa.targeting_android IS
NOT NULL)) THEN 'Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'Any'::text ELSE NULL::text END)
-> Sort (cost=466729.81..466730.31 rows=200 width=52) (actual
time=2865.528..2866.368 rows=4280 loops=1)
Sort Key: (1), (CASE WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'iOS and Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NULL)) THEN
'iOS'::text WHEN ((fa.targeting_ios IS NULL) AND (fa.targeting_android IS
NOT NULL)) THEN 'Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'Any'::text ELSE NULL::text END)
Sort Method: quicksort Memory: 395kB
-> Merge Full Join (cost=466713.55..466722.17 rows=200 width=52)
(actual time=2857.653..2863.873 rows=4280 loops=1)
Merge Cond: (fa.id = "*SELECT* 1".campaign_id)
-> Group (cost=191854.00..191854.01 rows=1 width=44) (actual
time=246.292..249.519 rows=4280 loops=1)
Group Key: fa.id, (CASE WHEN ((fa.targeting_ios IS NOT
NULL) AND (fa.targeting_android IS NOT NULL)) THEN 'iOS and Android'::text
WHEN ((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NULL))
THEN 'iOS'::text WHEN ((fa.targeting_ios IS NULL) AND (fa.targeting_android
IS NOT NULL)) THEN 'Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NOT NULL)) THEN 'Any'::text ELSE NULL::text END)
-> Sort (cost=191854.00..191854.00 rows=1 width=40)
(actual time=246.288..247.277 rows=4280 loops=1)
Sort Key: fa.id, (CASE WHEN ((fa.targeting_ios IS
NOT NULL) AND (fa.targeting_android IS NOT NULL)) THEN 'iOS and
Android'::text WHEN ((fa.targeting_ios IS NOT NULL) AND
(fa.targeting_android IS NULL)) THEN 'iOS'::text WHEN ((fa.targeting_ios IS
NULL) AND (fa.targeting_android IS NOT NULL)) THEN 'Android'::text WHEN
((fa.targeting_ios IS NOT NULL) AND (fa.targeting_android IS NOT NULL)) THEN
'Any'::text ELSE NULL::text END)
Sort Method: quicksort Memory: 393kB
-> Hash Join (cost=12963.85..191853.99 rows=1
width=40) (actual time=31.233..244.131 rows=4280 loops=1)
Hash Cond: (fa.campaign = fc.id)
-> Seq Scan on fb_adsets fa
(cost=0.00..178890.05 rows=34 width=64) (actual time=0.109..210.360
rows=4280 loops=1)
Filter: ((account = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])) AND
(account = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
Rows Removed by Filter: 792308
-> Hash (cost=12961.00..12961.00 rows=228
width=8) (actual time=31.110..31.111 rows=694 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 36kB
-> Seq Scan on fb_campaigns fc
(cost=0.00..12961.00 rows=228 width=8) (actual time=0.013..30.894 rows=694
loops=1)
Filter: ((account IS NULL) OR
(account = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
Rows Removed by Filter: 197633
-> GroupAggregate (cost=274859.55..274865.64 rows=200
width=160) (actual time=2611.356..2612.484 rows=64 loops=1)
Group Key: "*SELECT* 1".campaign_id
-> Sort (cost=274859.55..274860.37 rows=327 width=22)
(actual time=2611.327..2611.831 rows=2026 loops=1)
Sort Key: "*SELECT* 1".campaign_id
Sort Method: quicksort Memory: 207kB
-> Hash Left Join (cost=171842.20..274845.89
rows=327 width=22) (actual time=2334.961..2610.504 rows=2026 loops=1)
Hash Cond: (("*SELECT* 1".logdate =
ud.logdate) AND ("*SELECT* 1".account_id = ud.account) AND ("*SELECT*
1".adgroup_id = ud.adgroup))
-> Hash Join (cost=15665.78..96165.51
rows=327 width=42) (actual time=156.656..198.283 rows=2026 loops=1)
Hash Cond: ("*SELECT*
1".campaign_group_id = c.id)
-> Append (cost=0.42..79524.29
rows=327 width=48) (actual time=13.125..45.025 rows=2026 loops=1)
-> Subquery Scan on "*SELECT*
1" (cost=0.42..2.67 rows=1 width=48) (actual time=0.012..0.012 rows=0
loops=1)
-> Index Scan using
fb_ad_activity_daily2_pkey on fb_ad_activity_daily (cost=0.42..2.66 rows=1
width=1618) (actual time=0.011..0.011 rows=0 loops=1)
Index Cond:
((logdate >= '2019-11-01'::date) AND (logdate <= '2019-11-17'::date) AND
(logdate >= '2019-11-01'::date) AND (logdate <= '2019-11-17'::date))
Filter: ((account_id
= ANY ('{128091690677840,205971526867688,183647115790295}'::bigint[])) AND
(account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
-> Subquery Scan on "*SELECT*
2" (cost=0.56..79519.99 rows=326 width=48) (actual time=13.111..44.245
rows=2026 loops=1)
-> Append
(cost=0.56..79516.73 rows=326 width=1618) (actual time=13.110..43.372
rows=2026 loops=1)
-> Index Scan using
fb_ad_activity_daily_archive_2019_pkey on fb_ad_activity_daily_archive_2019
(cost=0.56..79515.10 rows=326 width=1618) (actual time=13.108..42.569
rows=2026 loops=1)
Index Cond:
((logdate >= '2019-11-01'::date) AND (logdate <= '2019-11-17'::date) AND
(logdate >= '2019-11-01'::date) AND (logdate <= '2019-11-17'::date) AND
(account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])) AND
(account_id = ANY
('{128091690677840,205971526867688,183647115790295}'::bigint[])))
-> Hash (cost=12217.27..12217.27
rows=198327 width=10) (actual time=143.122..143.122 rows=198327 loops=1)
Buckets: 262144 Batches: 2
Memory Usage: 6693kB
-> Seq Scan on fb_campaigns c
(cost=0.00..12217.27 rows=198327 width=10) (actual time=0.007..74.000
rows=198327 loops=1)
-> Hash (cost=84095.88..84095.88
rows=3085688 width=20) (actual time=2177.119..2177.119 rows=3093410
loops=1)
Buckets: 131072 Batches: 32 Memory
Usage: 6348kB
-> Seq Scan on ca_ud_conversions ud
(cost=0.00..84095.88 rows=3085688 width=20) (actual time=0.005..1005.261
rows=3093410 loops=1)
Planning Time: 3.733 ms
Execution Time: 2868.068 ms

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2020-09-22 15:53:37 Re: BUG #16627: union all with partioned table yields random aggregate results
Previous Message İsmet BALAT 2020-09-22 13:07:34 Re: Postgresql Bug Report