From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: enable_incremental_sort changes query behavior |
Date: | 2020-10-05 15:33:10 |
Message-ID: | CAAaqYe8_Y7Ydfwdr_q0NWCpr+E_Hc_e=ScEhi9MUmuLT45cNDA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Oct 4, 2020 at 9:40 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Sat, Oct 3, 2020 at 10:10 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> >
> > On Sat, Oct 3, 2020 at 5:44 PM Tomas Vondra
> > <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > >
> > > On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote:
> > > >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > >>
> > > >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > >> >
> > > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra
> > > >> > <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > > >> > >
> > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote:
> > > >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra
> > > >> > > ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > > >> > > >>
> > > >> > > >> ...
> > > >> > > >>
> > > >> > > >> More importanly, it does not actually fix the issue - it does fix that
> > > >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY
> > > >> > > >> or GROUP BY make it fail again :-(
> > > >> > > >>
> > > >> > > >> Attached is a simple script I used, demonstrating these issues for the
> > > >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment
> > > >> > > >> before TargetEntry in plannodes.h).
> > > >> > > >
> > > >> > > >So does checking for volatile expressions (if you happened to test
> > > >> > > >that) solve all the cases? If you haven't tested that yet, I can try
> > > >> > > >to do that this evening.
> > > >> > > >
> > > >> > >
> > > >> > > Yes, it does fix all the three queries in the SQL script.
> > > >> > >
> > > >> > > The question however is whether this is the root issue, and whether it's
> > > >> > > the right way to fix it. For example - volatility is not the only reason
> > > >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it
> > > >> > > also blocks pushdown of leaky functions in security_barrier views. So I
> > > >> > > wonder if that could cause failures too, somehow. But I haven't managed
> > > >> > > to create such example.
> > > >> >
> > > >> > I was about to say that the issue here is slightly different from qual
> > > >> > etc. pushdown, since we're not concerned about quals here, and so I
> > > >> > wonder where we determine what target list entries to put in a given
> > > >> > scan path, but then I realized that implies (maybe!) a simpler
> > > >> > solution. Instead of duplicating checks on target list entries would
> > > >> > be safe, why not check directly in get_useful_pathkeys_for_relation()
> > > >> > whether or not the pathkey has a target list entry?
> > > >> >
> > > >> > I haven't been able to try that out yet, and so maybe I'm missing
> > > >> > something, but I need to step out for a bit, so I'll have to look at
> > > >> > it later.
> > > >>
> > > >> I've started poking at this, but haven't yet found a workable
> > > >> solution. See the attached patch which "solves" the problem by
> > > >> breaking putting the sort under the gather merge, but it at least
> > > >> demonstrates conceptually what I think we're interested in doing.
> > > >>
> > > >> The issue currently is that the comparison of expressions fails -- in
> > > >> our query, the first column selected shows up as a Var (with the same
> > > >> contents) but is a different pointer in the em_expr and the reltarget
> > > >> exprs list. I don't yet see a good way to get the equivalence class
> > > >> for a PathTarget entry.
> > > >
> > > >Hmm, I think I was looking to do is the attached patch. I didn't
> > > >realize until I did a lot of reading through source that we have an
> > > >equal() function that can compare exprs. That (plus the realization in
> > > >[1] the originally reported backtrace wasn't where the error actually
> > > >came from) convinced me that what we need is to confirm not only that
> > > >the all of the vars in the ec member come from the relids in the rel
> > > >but also that the expr is actually represented in the target of the
> > > >rel.
> > > >
> > > >With the gucs changed as I mentioned earlier both of the plans (with
> > > >and without a volatile call in the 2nd select entry) now look like:
> > > >
> > > > Unique
> > > > -> Gather Merge
> > > > Workers Planned: 2
> > > > -> Sort
> > > > Sort Key: ref_0.i, (md5(ref_0.t))
> > > > -> Nested Loop
> > > > -> Parallel Index Scan using ref_0_i_idx on ref_0
> > > > -> Function Scan on generate_series ref_1
> > > >
> > > >Without the gucs changed the minimal repro case now doesn't error, but
> > > >results in this plan:
> > > >
> > > > HashAggregate
> > > > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t
> > > >ELSE ref_0.t END
> > > > -> Nested Loop
> > > > -> Seq Scan on ref_0
> > > > -> Function Scan on generate_series ref_1
> > > >
> > > >Similarly in your six queries I now only see parallel query showing up
> > > >in the last one.
> > > >
> > >
> > > OK, that seems reasonable I think.
> >
> > If we proceed with this patch I'd like to tweak it to check for the
> > relids subset first on the theory that it will be cheaper than the
> > expr equality check loop; that change is attached.
> >
> > > >I created an entirely new function because adding the target expr
> > > >lookup to the existing find_em_expr_for_rel() function broke a bunch
> > > >of postgres_fdw tests. That maybe raises questions about whether that
> > > >code also could have problems in theory/in the future, but I didn't
> > > >investigate further. In any case we already know it excludes
> > > >volatile...so maybe it's fine because in practice that's actually a
> > > >broader exclusion than what we're doing here.
> > > >
> > >
> > > I don't think postgres_fdw needs these new checks, because FDW scan's
> > > are not allowed in parallel part of the plan - if that changes in the
> > > future, I'm sure that'll require fixes in plenty other places.
> > >
> > > OTOH it's interesting that it triggers those failures - I wonder if we
> > > could learn something from them? AFAICS those paths can't be built by
> > > generate_useful_gather_paths (because of the postgres_fdw vs. parallel
> > > query restrictions), so how do these plans look?
> >
> > Well the fdw code doesn't trigger the errors, but both
> > generate_useful_gather_paths() and the fdw code originally relied on
> > find_em_expr_for_rel(), but now they're diverging. IIRC the fdw code
> > also uses it for determining what path keys are valid -- in that case
> > which ones are safe to be sent to the foreign server (so all of the
> > vars have to be from rels on the foreign server). The fdw code has
> > additional checks too though, for example no volatile expressions may
> > be pushed to the remote.
> >
> > > >This seems to fix the issue, but I'd like feedback on whether it's too
> > > >strict. We could of course just check em_has_volatile, but I'm
> > > >wondering if that's simultaneously too strict (by not allowing the
> > > >volatile expression to be computed in the gather merge supposing
> > > >there's no join) and too loose (maybe there are other cases we should
> > > >care about?). It also just strikes me as re-encoding rules that should
> > > >have already been applied (and thus we should be able to look up in
> > > >the data we have if it's safe to use the expr/pathkey). Those are
> > > >mostly intuitions though.
> > > >
> > >
> > > I don't know :-( As I mentioned before, I suspect checking just the
> > > volatility may not be enough in some cases (leakyness, etc.) and I think
> > > you're right it may be too strict in other cases.
> > >
> > > Not sure I understand which rules you think we're re-encoding, but I
> > > have a nagging feeling there's a piece of code somewhere earlier in
> > > the query planner meant to prevent such cases (sort on path without all
> > > the pathkeys), and that fixing it here is just a band aid :-(
> >
> > I'm getting at exactly what you're saying below: there's got to be
> > some existing rules (and presumably code already enforcing it in some
> > places) at what level in the path tree a given pathkey is
> > safe/correct, and we don't want to reinvent those rules (or ideally
> > that code). Verifying the expr is in the rel seems to me to be
> > intuitively correct, but I wish there was another place in the code
> > that could validate that.
> >
> > > I'm sure we're building plans with Sort on top of Index Scan paths, so
> > > how come those don't fail? How come the non-parallel version of these
> > > queries don't fail?
> >
> > Yeah, exactly. Something has to be doing something similar -- I don't
> > think everywhere else using sort_pathkeys instead of query_pathkeys
> > really changes the problem. I've read through all of the places we use
> > generate_useful_gather_paths() as well as root->sort_pathkeys to try
> > to get a better understanding of this. Most of the places I convinced
> > myself it's already safe -- for example places like
> > create_orderd_paths are working with the upper rel, and the full
> > target list is available. A few other places I added comments (see the
> > "questions" patch in the attached series) where I'm not sure why we
> > apply projections *after* we create a sort path; given the issue we're
> > discussing here I would have thought you'd want to do exactly the
> > opposite.
> >
> > I haven't yet tried to read through the code in other places where we
> > build an explicit sort to see if I can find any hints as to what
> > existing code does to ensure this situation doesn't occur, but so far
> > I haven't found anything else obvious.
>
> I did a lot more reading through code, and I concluded that generally
> (maybe exclusively?) create_sort_path() is only called on upper rels
> (e.g., from create_ordered_paths()), so we already have the full
> target list available. I also stepped through in the debugger and
> confirmed that non-var exprs generally don't seem to show up in the
> pathtarget of paths either on base rels or on join rels. Finally, the
> last piece of relevant data is that we don't actually push down sorts
> on non-parallel paths even when it would be beneficial. For example,
> this query:
>
> select i, md5(i::text) from t2, generate_series(1,2) order by i;
>
> generates this plan:
>
> Sort
> Sort Key: t2.i
> -> Nested Loop
> -> Function Scan on generate_series
> -> Materialize
> -> Seq Scan on t2
>
> but it seems obvious to me that it would be better to move the sort to
> immediately above the seq scan.
>
> So I conclude from all of this that the real reason we didn't see this
> problem before was that we didn't have anything that was trying to
> generate useful sort paths lower in the query, so it just worked. To
> state it from the other direction, I don't see any indication that
> there actually is any code trying to determine what's safe to put in
> the target at various join levels; from what I can tell only vars are
> distributed at those points. Thinking about it some more, this
> actually makes sense, since get_useful_pathkeys() is new code, and
> something like it would have been needed already if we wanted to push
> sorts down further into the plan.
>
> As a bit of a side note: combined with the above idea about pushing
> sort down, it could also be beneficial to try to distribute
> expressions lower too.
>
> So I think the previously attached patch is probably correct; the
> second patch with the comment questions shows I still don't fully
> understand how/when projections are applied (and when they're needed),
> though it might have something to do with what I said about only
> operating on upper rels.
Just FYI in case the patch seems mergeable; I'm adding test cases now,
and I think I might make some minor tweaks to the patch.
I hope to have a new version out ~this evening.
James
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2020-10-05 15:45:15 | Re: partition routing layering in nodeModifyTable.c |
Previous Message | Matthias Apitz | 2020-10-05 14:57:22 | Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0 |