
    The Problem With Implicit Variable Declarations

    September 16th, 2008

    Implicit variable declarations (e.g., Visual Basic without “Option Explicit”) can seem like a good idea if you look at them a certain way. I can understand what language designers are probably thinking when they decide to implement implicit declarations.

    “Let the computer do the work for you.”

    That’s a favorite mantra of mine, one of my cardinal rules. To a programmer, a computer with a compiler is like having an army of robots at your disposal. Why do things manually when you can send off a drone or a function or a script to do it for you while you get on with other things? So whenever possible, automate whatever you can. Efficiency: it’s nice.

    So I think I can understand what language designers are thinking when they decide to include the feature of implicit declarations. They’re thinking “Hey, why should I have to explicitly point out that I’m going to use a variable? If I’m using the variable somewhere in the code, the computer should be able to figure out on its own that the label is supposed to be a variable. Saves me the bother.” Sounds good. Same result, less effort. Efficiency. Nice.

    But there’s an ugly assumption hidden in that reasoning. What that developer is really saying is “Hey computer, anytime you come across an undeclared label being used as a variable, just go ahead and assume it’s a new variable”. We all know what happens when we make assumptions, right? “ASSumptions will make an ass outta ya.”

    What do you really want to happen when you mistype a variable name? (Oh, sure, you’d never do anything like that? Riiight? It happens to the best. Deal with it.) There are two choices: A. The compiler grabs you, points at your error, and yells “Hey! You screwed this up! Fix it!”, or B. The compiler pretends everything is ok, processes the bad code, and leaves you with a bug which, only if you’re very lucky, will manifest itself immediately and in a way that makes the exact nature and location of the problem obvious. Hmm, which is a better way to write code…?
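
    To make choice B concrete, here is a minimal sketch in Python, which (like VB without “Option Explicit”) creates a variable the first time you assign to it. The function name, the prices, and the tax rate are all made up for illustration; the only point is the misspelled “totl”.

```python
def total_with_tax(prices, tax_rate):
    """Sum the prices and add tax. Contains a deliberate typo."""
    total = 0
    for price in prices:
        total = total + price
    # Typo: "totl" instead of "total". Python treats this as a brand-new
    # variable, so nothing complains and the tax is silently dropped.
    totl = total * (1 + tax_rate)
    return total


# Hypothetical inputs: the taxed total would be 90, but the typo
# means the untaxed sum comes back instead.
result = total_with_tax([10, 20, 30], 0.5)
print(result)
```

    The program runs without a peep and hands back the wrong number. If “totl” had needed a declaration first, the mistake would have been caught before the code ever ran.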

    “But I might have really wanted it to be a new variable!” Ok fine. How about you go rewrite your “rm” command to never ask for confirmation just because, well hey, we can’t let the potential downfall of accidentally losing files force us to endure a much, much lesser inconvenience when we really do want to delete. Doesn’t make much sense, does it? Point being: Assumptions are bad. Bad enough even to outweigh convenience.

    Language design, and heck, API design in general, are more than just programming. There is code involved, yes, but there’s a large amount of psychology that needs to go into it as well. When you’re writing an ordinary function, you’re basically thinking “automation” (i.e., that “army of robots”). But when you’re designing an interface for programmers, even yourself, the really important thing suddenly becomes “How can I prevent the programmer from messing up?”. It’s like designing any interface: the weakest link is always the human factor. Even if it’s the best programmer in the world, even a highly unreliable computer will still make that human look like a giant pile of shoddy engineering by comparison. So it’s your job, as the language/interface/API designer, to do whatever you can to minimize that risk of programmer error.

    Another way to think of the issue is in terms of “good redundancy versus bad redundancy”. There tends to be a lot of value placed on eliminating redundancy. Often this is good. But sometimes redundancy can improve reliability, which is a very important concern. Mandatory explicit variable declarations are one form of “good redundancy” that improves reliability. Walter Bright explains it best:

    “Variable declarations are one [example of good redundancy in language design]. But since the compiler can figure the need for declarations from the context, declarations seem like prime redundancies that can be jettisoned. This is called implicit variable declaration. It sounds like a great idea, and it gets regularly enshrined into new languages. The problem is, the compiler cannot tell the difference between an intended new declaration and a typo - and the poor maintenance programmer can’t tell, either. After a while, though, the lesson about redundancy is learned anew, and a special switch will be added to require explicit declaration.”
    - Walter Bright: Redundancy in Programming Languages

    Visual Basic developers learned this lesson a long time ago. Even though VB supports implicit variable declarations, it’s extremely rare to come across a professional VB developer who doesn’t strongly recommend turning them off (with “Option Explicit”) and religiously do so in their own code. It’s a shame there are so many newer languages that haven’t learned from this.
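
    For a rough idea of what that kind of switch buys you, here is a loose analogue in Python. It is only an analogue: “__slots__” covers object attributes, not local variables, so it is not the same thing as “Option Explicit”. The class and function names are invented for the example.

```python
class Account:
    # Declaring the allowed attribute names up front is the "redundant" part.
    __slots__ = ("balance",)

    def __init__(self):
        self.balance = 0


class LooseAccount:
    # No declaration: any attribute name is accepted on assignment.
    def __init__(self):
        self.balance = 0


def deposit_with_typo(account, amount):
    """Try to update the balance through a misspelled attribute name."""
    try:
        account.balanse = account.balance + amount  # typo: "balanse"
    except AttributeError as err:
        return "caught: " + type(err).__name__
    return "typo went unnoticed"


outcome_strict = deposit_with_typo(Account(), 50)
outcome_loose = deposit_with_typo(LooseAccount(), 50)
print(outcome_strict)
print(outcome_loose)
```

    Same typo both times. The version with the declaration complains on the spot, while the version without it quietly grows a new attribute and leaves the real balance untouched.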


    Putting the “Engineering” back into “Software Engineering”

    September 16th, 2008

    (I wrote this a while ago, but didn’t post it for some reason. So I’m posting it now.)

    I’ve been very vocal about my distaste for many of the features of various dynamic programming languages (at least for the purposes of any non-trivial program). Recently, while reading the first chapter of “Practical Cryptography” (by Niels Ferguson and Bruce Schneier), it occurred to me that the authors’ explanation of “The Evils of Performance” was very relevant to my opinions of these languages.

    In that first chapter, the authors stress the importance of security being made the single top priority. They make this point by looking at security as an engineering discipline. As they point out, good engineering has always been about making safety and reliability the primary concerns. No other concern should ever be optimized to a point where it could interfere with safety or reliability.

    The authors go on to state that, in the same way, the computer industry needs to prioritize security far ahead of efficiency. To illustrate, they present the explanation: “We already have enough fast, insecure systems. We don’t need another one.”

    I’d argue that in computer programming, reliability deserves a similar status ahead of efficiency. Just as in other forms of engineering though, this doesn’t just mean placing reliability ahead of the actual product’s efficiency. This also means placing it ahead of the efficiency of the development process itself. We already have enough unreliable software being churned out.

    Which brings me back to dynamic programming languages: The reason I so strongly dislike many (albeit, not all) of the characteristics of these languages is that they treat short-term programmer productivity as a holy grail (sometimes even stating it as the single primary goal), while allowing good engineering principles like reliability to fall by the wayside. The real irony, though, is that maintaining an unreliable program is itself a drain on programmer productivity, which wipes out any long-term productivity gains. So for any non-trivial project, these languages would have been more productivity-friendly by going the engineering route and making design decisions that focused on aiding the creation of reliable software at a reasonable speed rather than potentially buggy software at rapid speeds.