Fundamentals of Web Application Development · DRAFT · Freeman

Scopes, Closures, & Variables

You probably have a good deal of experience working with some of the concepts we will discuss in this chapter – whether or not you realized it at the time! We will define what scope and closure mean in the context of JavaScript so that we can then use this vocabulary to unambiguously describe the behavior of variables in the execution flow of our programs.

Lexical Scopes

Scope is simply the set of rules defining how variables may be accessed by their name (identifier) at certain places in the code. Like many modern languages, JavaScript employs lexical scope, which means that these access rules have to do with where the code is written.

Lexical scope is a fairly intuitive mechanism for us as authors: when writing a given line of code, the variables to which we have access are the ones we can “see” from that line, which includes all of the variables in the current scope and all of the variables in each of its containing “ancestor” scopes.

Consider these three snippets of code, and notice the “bubbles” of scope highlighted throughout.

[Figure: Lexical Scopes. Three code snippets (lexical scope 1, 2, and 3) with their nested “bubbles” of scope highlighted.]

At the very “top” of every program is the global scope. Functions each define their own functional scope, and block scopes are defined by blocks such as if-else statements and for loops. These can be nested arbitrarily deep, and each scope is completely contained by its immediate outer scope – there is no such thing as a scope that exists only partially inside of another one, or that “overlaps” between multiple parent scopes.

Remember that the structure of these scopes is defined at the time of writing and remains the same throughout the execution of your program.
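A minimal sketch of what “defined at the time of writing” means in practice. The names `label`, `readLabel`, and `callFromElsewhere` are made up for this illustration. Notice that `readLabel` sees the variable from the place where it was written, not from the place where it happens to be called.

```javascript
const label = "outer";

function readLabel() {
  // `label` resolves from where readLabel was written, not from where it is called.
  return label;
}

function callFromElsewhere() {
  const label = "inner"; // a different variable, local to this function
  return readLabel(); // the caller's `label` is invisible to readLabel
}

console.log(callFromElsewhere()); // outer
```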

The structure of these lexical scopes defines how the JavaScript engine resolves identifiers during the execution of your code. When you reference a variable, the engine first looks for the identifier in the current scope. If it is not found there, the engine moves “up” and looks in the parent scope, then in that scope’s parent, and so on, until it either finds a variable with that identifier (at which point the lookup resolves to that variable) or reaches the global scope. If the identifier is not found in the global scope either, reading it throws a ReferenceError. (The typeof operator is the one exception: applied to an undeclared identifier, it evaluates to the string "undefined" rather than throwing.)
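The lookup walk can be sketched as follows. All of the names here (`planet`, `city`, `street`, `missingName`) are hypothetical. Note what happens at the end of the chain: reading an identifier that is declared in no scope at all throws a ReferenceError, while typeof quietly reports "undefined".

```javascript
const planet = "Earth"; // declared in the outermost scope

function outer() {
  const city = "Lagos"; // declared in outer's function scope

  function inner() {
    const street = "Main St"; // declared in inner's function scope
    // street: found in the current scope
    // city:   not found here, found one level up (outer)
    // planet: not found here or in outer, found in the outermost scope
    return [street, city, planet].join(", ");
  }

  return inner();
}

const address = outer();
console.log(address); // Main St, Lagos, Earth

// An identifier that is declared nowhere fails the lookup entirely.
let lookupError = null;
try {
  missingName; // hypothetical identifier, never declared
} catch (err) {
  lookupError = err;
}
console.log(lookupError instanceof ReferenceError); // true
console.log(typeof missingName); // undefined
```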

When a variable in a given scope has the same identifier as a variable in an outer scope, the inner variable is said to “shadow” the outer one. That is, it “casts a shadow” on the outer variable(s) with the same identifier, which are no longer directly accessible by name from within the inner scope. This follows from the lookup rule: the engine always starts from the current (innermost) scope before traveling “up” through the outer scopes, and it resolves as soon as it finds a matching identifier.

Closure

Closure is another concept that, while fundamental to almost any program in JavaScript, is easy to use without ever realizing what it actually is, or that you’re even using it. Once you understand what closure is, you will suddenly see it everywhere in JavaScript code.

Simply put, closure is the ability of a function to “remember” and access its lexical scope as normal, even while it is executing in a context outside of its lexical scope.

This is incredibly important in a language with first-class functions, where we may be passing around references to functions themselves as we would any other kind of object or value, and invoking any given function in any context. Closure allows us, as authors, to reason about the scope of data to which we have access at any given point independently from where the function’s context of execution may happen to end up at run-time.

Consider this (very minimal) example:

function foo() {
  let a = 42;

  function innerFunc() {
    return a;
  }

  return innerFunc;
}

let bar = foo();
typeof a; // "undefined"
bar(); // 42

We declare a named function foo, inside of which is another named function, innerFunc. Because of closure, innerFunc is closed over its containing functional scope, which in this case is foo. In the program execution, we get the reference to innerFunc by calling foo(), and save this reference on the variable named bar. Later, we invoke the function by its reference on bar. Even though we’re now executing it in the global scope, where we do not have lexical access to the variables inside of foo, we still get back the value of a from inside of foo.

Boom. Closure.

Granted, the above example is quite contrived. However, you will quickly learn the power and usefulness of closures as you write bigger and bigger programs employing functional programming styles.
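For a slightly less contrived taste, here is a sketch of a counter factory. The names `makeCounter`, `counterA`, and `counterB` are invented for this example. Each call to `makeCounter` creates a fresh scope, and the returned function keeps that scope alive via closure.

```javascript
function makeCounter() {
  let count = 0; // private to this particular call of makeCounter

  return function increment() {
    count += 1;
    return count;
  };
}

const counterA = makeCounter();
const counterB = makeCounter();

console.log(counterA()); // 1
console.log(counterA()); // 2
console.log(counterB()); // 1 (counterB closes over its own `count`)
console.log(typeof count); // undefined (not reachable from out here)
```

The only way to read or change either `count` is through the function that closed over it, which is about as close as plain functions get to private state.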

Variables

Now that we have an understanding of lexical scope and closure, we can discuss the kinds of variables that are used in JavaScript. Until 2015, there existed only one kind of variable: var. ES2015 introduced two new kinds, let and const. In the majority of cases, these latter two behave much more intuitively than their predecessor, and are so useful that we will examine them first. (Though, a good grasp on the behavior of vars is still fundamental to attaining a comprehensive understanding of JavaScript.)

let – Block Scope

Declaring a variable using let defines it inside of the current block scope (or function scope, if not inside of a block). For someone with experience in other object-oriented languages, let variables behave fairly intuitively.

For the code sample below, pretend that each thrown ReferenceError is immediately caught so that the rest of the code keeps executing. Also note that the comments inside of the function myFunc are the output that would be produced at the actual time of the function’s invocation.

console.log(b); // throws ReferenceError

b = 42; // throws ReferenceError (b has not been initialized yet)

let b = 42;

console.log(b); // 42

function myFunc() {
  let a;
  console.log(a); // undefined
  a = 52;
  console.log(a); // 52
  console.log(b); // 42 <-- b from outer scope
}

myFunc(); // <-- execute myFunc

console.log(a); // throws ReferenceError

for (let i = 0; i < 4; i++) {
  console.log(i); // <-- logs 0 // 1 // 2 // 3
}

console.log(i); // throws ReferenceError

As you can see, if we try to access a let variable before it has been declared, a ReferenceError is thrown. (Normally, this would stop execution at the point of throwing and propagate up the call stack until caught.)

There are a few other things you may notice: we can declare a let variable without assigning it a value, at which point it will hold the value undefined until it is assigned otherwise.

Notice in the statements towards the bottom that the let variable i is defined only within the for loop block scope.

We are not allowed to declare two variables with the same identifier in the same scope:

let b = 42;
let b = 52; // throws SyntaxError

Because this is a SyntaxError, which is thrown at compilation time (usually right before the function is created in memory, depending on the engine), none of the code within this scope will actually run; even the value of 42 is never assigned to b.

Here’s an example of the shadowing we discussed earlier:

let b = 42;

function myFunc() {
  let b = 50;
  console.log(b); // 50
}
myFunc();

console.log(b); // 42

Inside of myFunc we only have lexical access to the “most closely-scoped” identifier b, which in this case has the value 50. Once myFunc has finished running and the execution context returns back to the outermost scope, we again have lexical access to the first b as before.

Consider the behavior of the below example, which is very slightly different. Look at line 4 in particular – what do you predict will be output when myFunc is invoked?

let b = 42;

function myFunc() {
  console.log(b); // <--- (?)
  let b = 50;
}
myFunc();

console.log(b); // 42

Nothing is output; a ReferenceError is thrown.

Hmm… interesting. One might think that at the beginning of myFunc, before we declared an identifier b in our inner scope, we would still have lexical access to the outer b. This is not the case.

Behind the scenes, the declaration of b is virtually hoisted to the top of its containing block during compilation. This creates what is called a temporal dead zone for that variable identifier starting from the beginning of its containing block and ending at its associated let statement. At the let statement, the variable is then initialized with the given value (or undefined if none is given, as we previously saw). If we try to access the variable inside of this dead zone, whether to get or set its value, a ReferenceError is thrown. To visualize this hoisting process, consider the below code (left) and a virtually equivalent «pseudo-code» (right).

function myFunc() {  | «some block» {
                     |   «declare empty identifier b»
                     |   «start b deadzone»
  /* lots of code */ |   // lots of code
                     |   «end b deadzone»
  let b;             |   let b = undefined;
  /* more code */    |   // more code
  b = 42;            |   b = 42;
}                    | }

This behavior may seem odd at first, but it helps us to construct code that is less error-prone; the “mixed” lexical shadowing behavior we might have otherwise predicted would allow for the proliferation of an entire class of bugs that might slip past our human eyes (and maybe even our human-created static analysis tools).
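The dead zone can be observed directly. The sketch below uses a hypothetical variable `score` inside a bare block and records what happens at each step. One detail worth noticing: inside the dead zone, even typeof throws.

```javascript
const results = [];

{
  // Reading `score` here hits its temporal dead zone.
  try {
    results.push(score);
  } catch (err) {
    results.push(err.name);
  }

  // Unlike with an undeclared identifier, typeof is not safe inside the dead zone.
  try {
    results.push(typeof score);
  } catch (err) {
    results.push(err.name);
  }

  let score; // the dead zone ends here; score is initialized to undefined
  results.push(score);

  score = 10;
  results.push(score);
}

console.log(results); // [ 'ReferenceError', 'ReferenceError', undefined, 10 ]
```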

const – Block Scope Constants

The keyword const is used to declare variables whose values should be constant throughout their lifetime. Just as with let, const variables are block-scoped, follow the same hoisting behavior (and thus also have a temporal dead zone), and cannot be declared more than once in a given lexical scope. However, const creates a read-only reference to its value. As such, its value must be assigned in the same statement in which it is declared.

const MY_CONST; // throws SyntaxError

If we attempt to assign a new value to a const after its initialization, a TypeError is thrown.

const MY_CONST = 42; // 42
MY_CONST = 50; // throws TypeError

As with a ReferenceError, a TypeError is thrown only when the line is executed – it does not affect the program at compilation time like SyntaxError.
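The difference in timing can be made visible with a small experiment. This is only a sketch: the `events` array and the variable names are made up, and the Function constructor is used purely as a convenient way to compile a string of source code on demand (it is not something you would normally reach for).

```javascript
const events = [];

// Run-time error: the statements before the bad assignment still execute.
try {
  events.push("before reassignment");
  const limit = 10;
  limit = 20; // throws TypeError when this line is reached
} catch (err) {
  events.push(err.name);
}

// Compile-time error: the Function constructor must compile its source text first,
// so the duplicate declaration is rejected before any of the body can run.
try {
  const compiled = new Function(
    "events",
    'events.push("body started"); let b = 42; let b = 52;'
  );
  compiled(events);
} catch (err) {
  events.push(err.name);
}

console.log(events); // [ 'before reassignment', 'TypeError', 'SyntaxError' ]
```

Note that "body started" never appears: the SyntaxError prevented the function from ever being created, let alone invoked.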

After its assignment, a const variable will always contain the exact same value and cannot be re-assigned. It doesn’t matter what the value is – even if we try to re-assign the same value, it is still not allowed.

const MY_CONST = 42;
MY_CONST = 42; // throws TypeError

There’s a catch, though: this immutability constraint applies only to the single identifier-value reference binding created by const. If a const variable happens to reference an object, then that variable identifier will always reference that object, but the object itself will be mutable as normal.

const myConstObj = { b: 42 };
console.log(myConstObj.b); // 42
myConstObj.b = 50; // <--- allowed!
console.log(myConstObj.b); // 50
myConstObj = { b: 50 }; // throws TypeError

The use of const again provides us with more opportunities to catch bugs at author time, before software gets anywhere near our end users. If you are creating a variable to use later as a reference to a given value, but do not intend for its value to ever change in the current execution context, then you should (most likely) use const.
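If you also want the object itself to resist changes, one option (not covered above, so treat this as a side note) is the built-in Object.freeze. The `settings` object below is hypothetical. Keep in mind that freezing is shallow, as the last two lines show.

```javascript
const settings = Object.freeze({ theme: "dark", layout: { columns: 2 } });

let freezeError = null;
try {
  settings.theme = "light"; // silently ignored in sloppy mode, TypeError in strict mode
} catch (err) {
  freezeError = err;
}
console.log(settings.theme); // dark

// Object.freeze is shallow: nested objects stay mutable unless they are frozen too.
settings.layout.columns = 3;
console.log(settings.layout.columns); // 3
```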

var – Functional Scope

And now we arrive at the one kind of variable that has existed in the language since its creation in 1995. Variables declared using the var keyword behave differently than let and const in a few important, and possibly unintuitive, ways.

vars have functional scope. They are bound to the lexical scope of their innermost containing function, regardless of whether their declaration appears inside of a further-nested block scope.

Let’s revisit some of the code that we used in our exploration of let, instead replacing all of the variable declarations with var.

var b = 42;
console.log(b); // 42
foo();
function foo() {
  var a;
  console.log(a); // undefined
  a = 52;
  console.log(a); // 52
  console.log(b); // 42 <-- b from outer scope
}

console.log(a); // throws ReferenceError

Well, so far it appears that var is behaving much like let. But now consider:

function bar() {
  for (var i = 0; i < 4; i++) {
    console.log(i); // <-- logs 0 // 1 // 2 // 3
  }

  console.log(i); // 4 <-- !
  if (true) {
    var b = 100;
    console.log(b); // 100
  }

  console.log(b); // 100 <-- !
}

It appears that the declarations of var i in the for loop and var b in the if statement block have stuck around even after the execution of the blocks has completed. This turns out to be true, in fact – here, the contents of these inner block scopes are still part of the same function scope.
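This difference becomes very visible once closures get involved. In the sketch below (the function names are invented for illustration), callbacks created inside a loop close over the loop counter. With var there is a single function-scoped counter shared by every callback; with let each iteration gets its own.

```javascript
function collectWithVar() {
  const callbacks = [];
  for (var i = 0; i < 3; i++) {
    callbacks.push(() => i); // all three close over the one function-scoped i
  }
  return callbacks.map((fn) => fn());
}

function collectWithLet() {
  const callbacks = [];
  for (let i = 0; i < 3; i++) {
    callbacks.push(() => i); // each iteration gets a fresh block-scoped i
  }
  return callbacks.map((fn) => fn());
}

console.log(collectWithVar()); // [ 3, 3, 3 ]
console.log(collectWithLet()); // [ 0, 1, 2 ]
```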

Do not be fooled by code that uses vars but appears to be block-scoped. The following is valid:

function baz(number) {
  // number: hypothetical parameter standing in for "some condition"
  if (number <= 10) {
    var message = 'Your number is 10 or less.';
    console.log(message);
  } else {
    var message = 'Your number is greater than 10.';
    console.log(message);
  }
}

While this code just happens to do what the author (probably) intended, we can tell that it was (hypothetically) written with an incorrect mental model which assumed that each var message is block-scoped in its respective if-else control block. When programs grow to non-trivial sizes, this apparent-but-incorrect pattern can camouflage bugs arising from var overwrites.

Because I told you that the above code is valid, you probably noticed that we can re-declare vars. In fact, re-declaring a var (without an assignment) has no effect on the variable.

var a = 42;
console.log(a); // 42
var a = 50; // no problem
console.log(a); // 50
var a; // Did we just re-declare it?
console.log(a); // 50 <--- hmm, I guess not.

On one hand, this is a nice safety net, especially for those new to the language or even without any prior background in programming. If we’re writing a huge function and use var i near the top, then later forget and use var i near the bottom, then – barring any bugs arising from incorrect assumption of i’s state – our program will happily chug along, not breaking, and our end user will be none the wiser.

On the other hand, I think you can imagine scenarios in which careless use of var might lead to quite unintentional behavior.
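Here is one such scenario, sketched with made-up function names and a small hypothetical `grid`. The author re-uses the name `i` for a nested loop. With var, both loops share a single counter, so the inner loop silently clobbers the outer one; with let, the inner `i` merely shadows the outer `i`.

```javascript
function countCellsWithVar(grid) {
  var cells = 0;
  for (var i = 0; i < grid.length; i++) {
    // Re-declaring `var i` here does NOT create a new variable.
    for (var i = 0; i < grid[0].length; i++) {
      cells++;
    }
  }
  return cells;
}

function countCellsWithLet(grid) {
  let cells = 0;
  for (let i = 0; i < grid.length; i++) {
    // This `i` is a separate block-scoped variable that shadows the outer one.
    for (let i = 0; i < grid[0].length; i++) {
      cells++;
    }
  }
  return cells;
}

const grid = [
  [1, 2, 3],
  [4, 5, 6],
];

console.log(countCellsWithVar(grid)); // 3 (the outer loop ended after one pass)
console.log(countCellsWithLet(grid)); // 6
```

No error is thrown in the var version; it simply returns the wrong answer. With a differently shaped grid the same mistake could even loop forever.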

There is one more behavioral quirk of vars: their declarations and initialization are hoisted to the top of their containing functional scope – initialization, in this case, being an assignment of the value undefined. Because of this, vars have no temporal dead zone.

function foo() {
  console.log(a); // undefined <--- NOT a ReferenceError!
  var a = 42;

  console.log(a); // 42
}
foo();

function bar() {
  a = 84;
  console.log(a); // 84
  var a;
  console.log(a); // 84
}
bar();

Now we can understand more clearly why re-declarations (without assignments) don’t “reset” the variable back to undefined – the repeated declarations are all hoisted at the same level before execution, effectively acting as a single functional-scope declaration.

Pattern: IIFEs

Combining many of the concepts we have now learned about scopes, variables, closures, and functions, let us briefly explore one of the most common patterns in JavaScript development: the Immediately-Invoked Function Expression, or IIFE (pronounced “iffy”). As the name suggests, an IIFE is a function expression which is immediately invoked, such as:

// Some outer scope

typeof myPublicFunc; // "undefined"

(function() {
  var myPrivateVar = 42;

  function myPrivateFunc(x) {
    // "hidden" algorithm
    return x * 2;
  }

  window.myPublicFunc = function(input) {
    return myPrivateFunc(input + myPrivateVar);
  };

  // OR global.myPublicFunc = ...
})();

typeof myPrivateVar; // "undefined"

typeof myPrivateFunc; // "undefined"

typeof myPublicFunc; // "function"

myPublicFunc(1); // 86

While the first-level function here may at first appear to be a function declaration, the parentheses enclosing it instead make it a function expression (so it is evaluated “inline” at run time and is not hoisted). Continuing the statement, once the function body ends, the first set of parentheses closes and effectively “returns” a reference to this new (anonymous) function. That function is then immediately invoked by the last set of parentheses.

After the function’s execution, we have no way of accessing it ever again. This means that we also have no way of accessing anything from its scope nor any of the nested scopes inside of it – unless, at some point in the function, it modified the state of existing variables in our outer scope (including, possibly, the global scope) to give us references with which we can “peek” inside of it and see certain parts via closure.

What’s the use in this? Well, remember, until mid-2015, the language did not (officially) have any notion of block scope, only function scope. IIFEs effectively provide an inline block-like scope because of their characteristics we just described above. Learned computer scientists such as yourselves know that carefully controlling scope is an important part of software security and stability. It is generally frowned upon to create variables that “pollute” higher scopes, especially the global scope. At the same time, you want to limit outside access to the inner variables and functions, and any respective state that they may contain. If your entire program – and all of the variables and logic used to construct it – ran in the same scope, you would more often than not step on your own toes (or the toes of a third-party script that you employ).
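To make the “block-like scope” comparison concrete, here is a small sketch that does the same job twice, once with an IIFE and once with an ES2015 block. The variable names are invented for the example.

```javascript
// Pre-ES2015 style: an IIFE keeps `temp` out of the surrounding scope.
var doubledViaIife;
(function () {
  var temp = 21;
  doubledViaIife = temp * 2;
})();

// ES2015+ style: a bare block with let/const does the same job.
let doubledViaBlock;
{
  const temp = 21;
  doubledViaBlock = temp * 2;
}

console.log(doubledViaIife, doubledViaBlock); // 42 42
console.log(typeof temp); // undefined (neither `temp` leaked out)
```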

IIFEs provide a fairly succinct way to achieve these aims in a paradigm that has only a functional notion of scope and no notion of explicitly declaring variables as “public” or “private”. You will see this pattern commonly used in third-party libraries, which often start their own execution in the window or global scope after being loaded through a script tag in an HTML document.
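A common variation, sketched below with hypothetical names, has the IIFE return an object of “public” functions instead of attaching them to window or global. Everything that is not returned stays private and is reachable only through closure.

```javascript
const counterModule = (function () {
  var count = 0; // private: reachable only through the functions below

  // private helper: never lets the count drop below zero
  function clamp(value) {
    return Math.max(0, value);
  }

  // Only what we return here is visible to the outside.
  return {
    increment: function () {
      count = clamp(count + 1);
      return count;
    },
    decrement: function () {
      count = clamp(count - 1);
      return count;
    },
  };
})();

console.log(counterModule.increment()); // 1
console.log(counterModule.decrement()); // 0
console.log(counterModule.decrement()); // 0 (clamped by the private helper)
console.log(typeof count); // undefined
console.log(typeof clamp); // undefined
```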