Re: [Proposal] Level4 Warnings show many shadow vars

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Ranier Vilela <ranier_gyn(at)hotmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Level4 Warnings show many shadow vars
Date: 2019-12-11 18:57:31
Message-ID: 20191211185731.GY6962@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Ranier Vilela (ranier_gyn(at)hotmail(dot)com) wrote:
> De: Stephen Frost
> Enviadas: Quarta-feira, 11 de Dezembro de 2019 15:34
>
> >I agree with not breaking things but that doesn't mean the only
> >reasonable approach is to do the absolute minimum- you might not be
> >breaking something today, but it's going to confuse people later on down
> >the road and may lead to bugs being introduced due to that confusion, or
> >at the very least will add to people's time to figure out what's really
> >going on.
> I don't know how such fixes could lead to more bugs.
> Currently there is a risk of having bugs by mixing access to shadow variables with macros.

I really don't have any doubts that it's going to lead to confusion,
particularly in a case like the numTables vs. nTables thing you're
proposing in the one case that I spent some time looking at, and that
confusion certainly could lead to bugs. Sure, having shadow variables
also could- but you haven't identified an actual bug there today, so why
not just fix it in a way that eliminates the confusion here?

Here's an example of my concern- we change the name of the numTables
variable in these pg_dump functions to nTables as you propose... And
then later on someone starts hacking on these functions and they know
about the global and they start using it, so now we've got two
variables, both able to be used in the same scope, but one of them is a
global and the other is a local. As long as both are always the same,
sure everything works- but what happens if, for whatever reason, someone
uses the function in a new way and passes in a different value as an
argument, one that doesn't match what the global has? Well, some of the
code will use the argument, and some of the code won't. At least today,
there's no chance that the global variable will be used inside that
function- it's *always* going to use the argument passed in.

I don't think that's even that far-fetched of a possibility considering
most of the code is using the global variable directly and these
functions are really the odd ones where numTables is being passed in as
an argument, so ending up with a mix in the function looks rather likely
to happen, and a bug resulting from that inconsistency entirely
possible.

It's also possbile that the changes you're proposing might themselves
induce bugs- by keeping the variable and just renaming it, you had
better be ABSOLUTELY sure you rename every case because, if you don't,
everything will still work just *fine*, except where you missed a case,
the code will reference the global and the compiler won't complain and
it might very well look like everything is working.

Either way, in my view, you had better review the code, have an
understanding of how it works, and make sure that the change you're
making is correct and makes sense, and that you've tested it well.

> >I wasn't suggesting to change the names of the global variables in this
> >specific case, though I could see that being a better approach in some
> >instances- but it really depends. Each case needs to be reviewed and
> >considered and the best approach taken.
> Again, depending on the case, whether the best approach is to promote structure creation, variable removal, and logic changes, for now, is really beyond my reach.

Then I'd suggest that you spend some time looking at each case and
working through what the best approach is and proposing patches that use
the best approach in each case. If you don't wish to spend time on
that, that's fine, but I don't agree with this approach of just pushing
through and making things changes just to satisfy a particular compiler
warning. I don't anticipate further discussion on this changing my mind
on this point.

> >I think we need to be looking at the changes and considering them, and
> >the person proposing the changes should be doing that and not just
> >expecting everyone else to do so.
> Again, I am considering only the range of my changes, which are minimal, so less likely to do something wrong, or hinder future development.

I've already pointed out why I don't think this is the right approach to
be addressing these issues, and it seems that you don't disagree with me
about the recommended changes I've suggested, you've just said that you
only want to think about or care about renaming of variables and I am
specifically saying that's not an acceptable approach to addressing
these issues.

> >I'd suggest that we work through that then and get you up to speed on
> >the structures and logic- the pg_dump code is pretty ugly but the
> >specific usage of numTables isn't too bad. Each of these should be
> >looked at independently and thought about "what's the right way to fix
> >this?" The right way isn't necessairly to just rename the variables, as
> >I was saying, and doing so may lead to more confusion, not less.
> This way it will take a long time to eliminate all name collisions.

Why is that an issue?

> And worse, in my opinion, will continue to be adding new cases, since there is no rule, so check if this happens in the current development.

Feel free to monitor the situation and complain about new patches which
are proposed that add to them. I don't have any particular problem with
that. Nor do I object to generally pushing forward with the goal of
eliminating the existing ones.

Let me lay this out in a different way- we could do the exact same thing
you're doing here by just mindlessly changing, right before we commit,
any variables that shadow global variables, we'd eliminate the compiler
error, but it doesn't directly make anything *better* by itself, and
ultimately isn't really all that different from the current situation
where the compiler is essentially doing this for us by manging the
variables as shadowing the globals thanks to C scoping rules, except
that we add in the possibility of mixing usage of the local and the
global throughout the functions therefore adding to the confusion.

> Not only are they global, there are dozens, perhaps hundreds of shadow local variables.

That doesn't actually make any of them bugs though.

> I was working on this second class of variables, which, in my opinion, would lead to less code, less bugs, and more security for the code, but I realize that my effort may not be worth it.

I'm all for working to eliminate these shadow variables, but, again, not
through rote renaming of the locals without putting in any real thought
about what the code is doing and working out what the right approach to
such a change to eliminate the shadow variables should be.

> >Having shadowed globals, while kinda ugly, doesn't necessairly mean it's
> >a bug. I'm not sure what "minor performance improvements" are being
> >claimed here but there's a whole lot of work involved in demonstrating
> >that a change is a performance improvement.
> I was referring to contributions like this:
> https://github.com/postgres/postgres/commit/91da65f4ac2837e0792071e42b2e2101059f1b1b
> and not specifically, performance improvements in this global unshadow patch.

Ok, so this specific "global unshadow patch" is all about bugs, but
without actually showing that there's actual bugs here, just that there
are shadowed variables... In which case, the real question is "is this
change an improvement to the code" and I'm arguing that just the act of
changing the variable names to avoid shadowing isn't necessairly a code
improvement- that has to be evaluated on a case-by-case basis. If
you're not going to do that evaluation then you're just throwing changes
at the community with the expectation that someone else will do the
analysis and decide if the changes are worthwhile or not and that
strikes me as not really being very helpful. I'd really rather work
towards patches that are clear improvements which have been well
considered by the proposer of the patch.

Thanks,

Stephen

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-12-11 20:10:51 Re: Windows buildfarm members vs. new async-notify isolation test
Previous Message Robert Haas 2019-12-11 18:35:26 Re: global / super barriers (for checksums)