From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Brian Kanaga <kanaga(at)consumeracquisition(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16627: union all with partioned table yields random aggregate results
Date: 2020-09-24 09:43:54
Message-ID: CAApHDvpm+YbPqCFQESE9h7Jx8-XX0vrJtjEySE8t-D8HHZW11A@mail.gmail.com
Lists: pgsql-bugs
Hi Brian,
On Thu, 24 Sep 2020 at 01:50, Brian Kanaga
<kanaga(at)consumeracquisition(dot)com> wrote:
> Attached is a much-simplified version of the problem query along with screen
> shots of plans and what tweaks to the query produce changes to the plan.
Thanks for sending those.
(For the future, attaching a text file with the queries and EXPLAIN
output would be much easier to work with. I tend to compare these
sorts of things in a text diff tool, and playing spot-the-difference
with images is trickier.)
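For example, something along these lines in psql writes the complete
plan out as a text file you can attach (just a sketch; the file name
is arbitrary, and you'd swap in the real query):

\o issue_plan.txt
explain (analyze, verbose, buffers)
select 1; -- replace with the problem query
\o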
Comparing the images you attached, it does look like the index scan on
fb_add_daily_archive_2019_pkey found more rows in the parallel version
of the plan. EXPLAIN ANALYZE reports per-loop row counts for parallel
scans, so 262 rows * 4 loops = 1048 in total, but only 826 on the
"noissue.png" plan.
In the cases you've shown that were incorrect, the aggregated value is
larger. So, assuming you're always aggregating positive values, the
incorrect result does hint that something is reading more rows than it
should. The row counts indicate that's the case: ~1048 in the error
case and only 826 in the correct one. It would be good to see the full
text of the EXPLAIN ANALYZE output to confirm the predicates match
properly; that part was chopped off the screenshot.
I noticed that the "issue.png" plan has a nested Parallel Append, the
outer of which has a mix of parallel and parallel-safe paths. I'm not
sure how relevant that is, but nested Parallel Appends are probably
not that common.
I played around with the following, trying to produce a similar plan
with a nested Parallel Append containing a mix of parallel and
parallel-safe paths. Trying this on 11.4, I didn't see any executions
with an incorrect tuple count.
drop table if exists t;
drop table if exists pt;

create table pt (a int) partition by range (a);
create table pt1 partition of pt for values from (0) to (3000000);
create index on pt1 (a);

create table t (a int primary key);

insert into t select x from generate_series(1, 2000000) x;
insert into pt select x from generate_series(1, 2000000) x;

-- coax the planner into the plan shape in question:
alter table t set (parallel_workers = 0); -- only pt's scan may be parallel
set enable_bitmapscan = 0;
set enable_indexonlyscan = 0;
set work_mem = '200MB';

select count(*)
from (select * from t where a between 100000 and 200000
      union all
      select * from t where a between 200000 and 300000
      union all
      select * from pt where a between 900000 and 999999) t;
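FWIW, to watch for unstable results you can just re-run the count
repeatedly. Since BETWEEN is inclusive, the correct answer here is
300002 (100001 + 100001 + 100000). In psql, something like the
following re-executes the query every second until cancelled with
Ctrl-C:

select count(*)
from (select * from t where a between 100000 and 200000
      union all
      select * from t where a between 200000 and 300000
      union all
      select * from pt where a between 900000 and 999999) t
\watch 1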
> I have tried to recreate this for you in a dump file but I could not get it
> to happen without including gobs of data. Even tweaking the plan I could
> not get the filter part evaluating to match the offending plan.
How large is the dataset? If the data was properly anonymised and the
size wasn't too insane, would you be allowed to share it? Privately
would be an option.
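If it helps, a dump of just the relevant table (pg_dump -Fc
--table=fb_add_daily_archive_2019, assuming the index name reflects
the table name) would be enough, and something like the following
could scrub identifying values first. This is only a sketch: the
column names are invented for illustration, so adjust to the real
schema:

-- hypothetical anonymisation pass; "account_name" and "campaign_name"
-- are made-up column names -- replace with whatever is identifying
update fb_add_daily_archive_2019
set account_name = md5(account_name),
    campaign_name = md5(campaign_name);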
David