Data Collections & Iteration


For most kinds of applications, a great deal of the main program logic is concerned with managing and manipulating collections of data. As we will see in this section, JavaScript provides some very handy ways to work with collections.

Built-In Collection Objects

`Array`

The most commonly used type of collection is the Array. In fact, JavaScript provides a unique shorthand literal syntax just for creating them.

let myArr = [42, 'cool', { a: 3 }, ['inner array!']];

myArr.length; // 4
myArr[0]; // 42
myArr[myArr[2].a]; // ["inner array!"] <-- myArr[2].a is 3, so this is myArr[3]
typeof myArr; // "object"
Array.isArray(myArr); // true
Array.prototype.isPrototypeOf(myArr); // true

Arrays are simply objects. However, they inherit a great deal of functionality from the global built-in Array.prototype.

let myArr = [3, 5, 7, 9, 12];

myArr.pop(); // 12 // now [3, 5, 7, 9]
myArr.push(11); // 5 <-- (new array length) // now [3, 5, 7, 9, 11]
myArr.shift(); // 3 // now [5, 7, 9, 11]

There are many more simple array manipulation methods – for a comprehensive list, see MDN’s reference for Array.¹ We will explore some of the more interesting and useful methods later in this chapter.

`Set`

A Set is a collection of unique values.

let mySet = new Set();

mySet.add(42);
mySet.add('cool');
mySet.add(42); // <-- duplicate, silently ignored
mySet.has(42); // true
mySet.size; // 2
mySet.delete('cool');

mySet.size; // 1

A nice characteristic of the built-in iterable object types is that they can be easily converted between one another. This is especially useful when, for example, we need to de-duplicate the entries of an Array.

let myArr = [3, 5, 7, 9, 5, 3];

let mySet = new Set(myArr); // <-- from existing iterable
let dedupedArr = Array.from(mySet);

console.log(dedupedArr); // [ 3, 5, 7, 9 ]

`Map`

The third collection object type is Map, which is a collection of key-value pairs. As you may realize, we already have a map-like mechanism: regular ol’ objects. Why would we need Map objects, then? Most of the time we don’t, but they do have a few more capabilities than regular objects that make them suited for certain use cases.

Maps are iterable, so they can be used interchangeably with Arrays or Sets in many places. Maps are manipulated and searched through instance methods such as set, get, has, and delete. Whereas regular object property names are strictly string or symbol primitives, keys in Map entries can be values of any type, including objects.
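As a minimal sketch of these points, the snippet below uses objects as Map keys, something a plain object cannot do. The names visits, leo, and polly are made up for illustration.

```javascript
// Hypothetical example: count visits per animal, keyed by the animal objects themselves.
const leo = { name: 'Leo' };
const polly = { name: 'Polly' };

const visits = new Map();
visits.set(leo, 2); // .set(key, value) adds or updates an entry
visits.set(polly, 1);
visits.set(leo, visits.get(leo) + 1); // same key, so the existing entry is updated

visits.get(leo); // 3
visits.has(polly); // true
visits.size; // 2

// Maps are iterable; each entry is a [key, value] pair, in insertion order.
const summary = [];
for (const [animal, count] of visits) {
  summary.push(`${animal.name}: ${count}`);
}
console.log(summary); // [ "Leo: 3", "Polly: 1" ]

visits.delete(polly); // true
visits.size; // 1
```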

Iterables

Very rarely in JavaScript will you ever need to write out a for-loop in its long, imperative form.

let myData = [
  /* big data array */
];

for (let i = 0; i < myData.length; i++) {
  let element = myData[i]; // do something with the element
}

Often, we don’t really care about keeping track of the index; we just want to iterate over all of the elements. JavaScript provides a nice syntactic shorthand called the for..of loop which can be used with any iterable.

let myArr = [42, 'cool', { a: 3 }];

for (let element of myArr) {
  console.log(typeof element);
}

// Logs:
// "number"
// "string"
// "object"
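The same loop works on other built-in iterables, not just Arrays. A small sketch, with an illustrative seen array used only to collect what each loop visits:

```javascript
// for..of behaves the same way on a Set, a Map, and a string.
const seen = [];

for (const value of new Set([1, 2, 2, 3])) {
  seen.push(value); // the duplicate 2 was already dropped by the Set
}

for (const [key, value] of new Map([['a', 1], ['b', 2]])) {
  seen.push(`${key}=${value}`); // each Map entry is a [key, value] pair
}

for (const char of 'hi') {
  seen.push(char); // strings yield one character (code point) at a time
}

console.log(seen); // [ 1, 2, 3, "a=1", "b=2", "h", "i" ]
```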

Functional Iteration

Arrays provide mechanisms for iteration which behave well in a functional style of programming. The .forEach() method executes a given callback function once per array element. Let’s look at the method’s signature:

arr.forEach(callback[, thisArg])

The callback function we provide takes three optional arguments: the current element value, the current index, and a reference to the array itself. .forEach() optionally takes a second argument, thisArg, which will be bound as the value of this during execution of the callback function.

let myArr = [42, 'cool', { a: 3 }];

let myFunc = function(element, index) {
  console.log(element, 'is at index', index);
};

myArr.forEach(myFunc);

// Logs:
// 42 "is at index" 0
// "cool" "is at index" 1
// { a: 3 } "is at index" 2
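The optional thisArg is easiest to see with a callback that refers to this. The sketch below uses a hypothetical logger object and a record function invented for this illustration; the callback must be a regular function, since arrow functions do not get their own this.

```javascript
// Hypothetical object used only to illustrate thisArg.
const logger = {
  prefix: 'item:',
  lines: [],
};

// A regular function (not an arrow function), so forEach can bind `this`.
function record(element, index) {
  this.lines.push(`${this.prefix} ${index} -> ${element}`);
}

['a', 'b'].forEach(record, logger); // logger is `this` inside record

console.log(logger.lines); // [ "item: 0 -> a", "item: 1 -> b" ]
```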

Unlike for..of loops, there is no way to break from forEach iteration. If you need the ability to break, you can always use an ordinary loop instead. Arrays also provide convenient functional-style methods that remove the need for breaks in common cases, such as searching for a value. Most Array iteration methods accept a callback that is passed the same three arguments we just described for the .forEach() method.
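For comparison, here is a sketch of the loop-with-break approach to a search. The values array and the visited counter are illustrative only.

```javascript
const values = [42, 'cool', { a: 3 }, 'later'];
let firstString;
let visited = 0;

for (const element of values) {
  visited++;
  if (typeof element === 'string') {
    firstString = element;
    break; // stop as soon as we have a match
  }
}

console.log(firstString); // "cool"
console.log(visited); // 2 (the last two elements were never visited)
```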

.some() lets us determine whether any elements in an array pass a given test. Similarly, .every() lets us determine whether all elements in an array pass a given test. .find() can use the same test to return the value of the first element which passes the test. The test in these cases is simply the callback function we provide, which should return a truthy value for elements that “pass” and falsy value for those that “fail”.

let myArr = [42, 'cool', { a: 3 }];

let isString = function(element) {
  if (typeof element === 'string') {
    return true;
  } else {
    return false;
  }
};

myArr.some(isString); // true
myArr.every(isString); // false
myArr.find(isString); // "cool"

Internally, these methods immediately return (stop iteration) the first time they find an element which passes the test function (or fails, for .every()), for the same performance reason we would have used break inside of a regular for loop.
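One way to observe this early exit is to count how many times the test function is called. This is a sketch; the calls counter and the isStringCounted name are assumptions added for the demonstration.

```javascript
const data = [42, 'cool', { a: 3 }, 'later'];
let calls = 0;

const isStringCounted = function(element) {
  calls++; // count how many elements the method actually tests
  return typeof element === 'string';
};

data.some(isStringCounted);
const callsForSome = calls;
console.log(callsForSome); // 2: stopped at "cool", the first element that passes

calls = 0;
data.every(isStringCounted);
const callsForEvery = calls;
console.log(callsForEvery); // 1: stopped at 42, the first element that fails
```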

When working with very large collections of data, three functions are of great use: .filter(), .map(), and .reduce(). To explore these and see how they work together, let’s bring back our scenario of writing an application to help manage a pet shop. For the remainder of the code examples in this section, we’re going to use a dataset that looks like this:

const animals = [
  { name: 'Leo', species: 'dog', age: 8 },
  { name: 'Snowball', species: 'cat', age: 6 },
  { name: 'Polly', species: 'bird', age: 2 },
  { name: 'Goldey', species: 'fish', age: 1 },
  { name: 'Fido', species: 'dog', age: 3 },
  { name: 'Kitty', species: 'cat', age: 4 },
  { name: 'Squawks', species: 'bird', age: 12 },
  { name: 'Bubbles', species: 'fish', age: 2 }, // plus any more animals you can imagine
];

Let’s say that our user only wants to look at dogs for now. We can use the .filter() method to create a new array which contains only elements which pass a given test.

function isDog(animal) {
  if (animal.species == 'dog') {
    return true;
  } else {
    return false;
  }
}

let dogs = animals.filter(isDog);

console.log(dogs);
// [ { name: 'Leo', species: 'dog', age: 8 },
//   { name: 'Fido', species: 'dog', age: 3 } ]

It’s important to know that the new array returned by .filter() contains shallow copies of the elements from the original array; that is, only the immediate value of the element is copied. For primitives, this is effectively a duplicate, but for elements that are objects, only the object reference is copied. An object element in the filtered array will point to the exact same object in memory as its respective element in the original array.

dogs[0] === animals[0]; // true
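The practical consequence is that mutating an object through the filtered array is visible through the original array, while adding or removing elements of the filtered array is not. A sketch, using a smaller stand-in dataset (pets) so that it runs on its own:

```javascript
// A smaller stand-in for the animals dataset.
const pets = [
  { name: 'Leo', species: 'dog', age: 8 },
  { name: 'Snowball', species: 'cat', age: 6 },
];

const onlyDogs = pets.filter(function(pet) {
  return pet.species === 'dog';
});

onlyDogs[0].age = 9; // mutate the shared object through the filtered array
console.log(pets[0].age); // 9: the original array sees the change

onlyDogs.pop(); // changing the new array itself...
console.log(pets.length); // 2: ...does not affect the original array
```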

Here’s another task: the user wants a list of all the animals’ names. We can use the .map() function to map each element in the original array to a transformed value in a new array. We supply the transformation as a callback function which takes the now-familiar (element, index, array) arguments.

function justName(element) {
  return element.name;
}

let names = animals.map(justName);

console.log(names);
// [ "Leo", "Snowball", "Polly", "Goldey", ... ]

Because .filter() and .map() each return a new array, they are easily chainable. Imagine that our user just wants the names of all dogs. Keeping the functions that we just defined above,

let dogNames = animals.filter(isDog).map(justName);

console.log(dogNames); // [ "Leo", "Fido" ]

This is a very common pattern, and is especially useful if we are not interested in keeping around intermediate results from successive applications of .map() and .filter().
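The .reduce() method named earlier fits at the end of the same kind of chain: it folds an array down to a single value by carrying an accumulator from one callback call to the next. A rough sketch, using a smaller stand-in dataset (pets) and an illustrative totalDogAge variable:

```javascript
const pets = [
  { name: 'Leo', species: 'dog', age: 8 },
  { name: 'Snowball', species: 'cat', age: 6 },
  { name: 'Fido', species: 'dog', age: 3 },
];

// Sum the ages of all dogs: filter, map to ages, then reduce to one number.
const totalDogAge = pets
  .filter(function(pet) { return pet.species === 'dog'; })
  .map(function(pet) { return pet.age; })
  .reduce(function(sum, age) { return sum + age; }, 0); // 0 is the initial accumulator

console.log(totalDogAge); // 11
```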

Arrays also have a .sort() method. By default, elements are converted to strings and sorted in ascending order of their UTF-16 code unit values.

console.log(names);
// [ "Leo", "Snowball", "Polly", "Goldey",
//   "Fido", "Kitty", "Squawks", "Bubbles" ]

names.sort();

console.log(names);
// [ "Bubbles", "Fido", "Goldey", "Kitty",
//   "Leo", "Polly", "Snowball", "Squawks" ]

Notice that sorting an array does not make a copy; it changes the order of elements for the array on which it is called. (.sort() will still return a reference to the array itself, though, so it is chainable along with other array methods.)
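One consequence of the default string comparison is worth seeing concretely: numbers are not sorted numerically. The sketch below also confirms that the returned value is the very same array.

```javascript
const numbers = [10, 9, 1, 100];

const result = numbers.sort(); // no comparator: elements are compared as strings

console.log(numbers); // [ 1, 10, 100, 9 ]
console.log(result === numbers); // true: same array, sorted in place
```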

The .sort() function takes one optional parameter: a comparator, or comparison function. The comparator is given two arbitrary array elements as arguments, element “A” and element “B”. The comparator’s only job is to determine which of the two elements is “greater” and which is “lesser” (or if they should be considered the same) for purposes of sorting. It should return a positive number if A is greater (A is placed after B), a negative number if A is lesser (A is placed before B), and 0 if A and B should be considered equal.

Here’s a use case: the user wants a list of all of the animals ordered by age, from youngest to oldest. (We’ll work on a shallow copy of the original array so we don’t modify its order.)

let animalsCopy = Array.from(animals); // shallow copy

function byAge(a, b) {
  if (a.age > b.age) return 1;
  else if (a.age < b.age) return -1;
  else return 0;
}

animalsCopy.sort(byAge);

console.log(animalsCopy);

/* [ { name: "Goldey", species: "fish", age: 1 },
     { name: "Polly", species: "bird", age: 2 },
     ...
     { name: "Leo", species: "dog", age: 8 },
     { name: "Squawks", species: "bird", age: 12 } ] */
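When the values being compared are numbers, a comparator is often written as a subtraction, since the sign of the result is all that matters. A sketch using a plain array of ages (the ages array is illustrative):

```javascript
const ages = [8, 6, 2, 1, 3, 4, 12, 2];

// a - b is negative when a < b, positive when a > b, and 0 when they are equal.
const ascending = Array.from(ages).sort(function(a, b) { return a - b; });

// Swapping the operands reverses the order.
const descending = Array.from(ages).sort(function(a, b) { return b - a; });

console.log(ascending); // [ 1, 2, 2, 3, 4, 6, 8, 12 ]
console.log(descending); // [ 12, 8, 6, 4, 3, 2, 2, 1 ]
```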

Arrow functions are especially useful when chaining together functional-style data management calls. We can express the program’s functionality in our code more succinctly and legibly.

Say we want to get a list of the names of all cats aged 2 or older, in alphabetical order. Instead of defining a named function to use for each callback, we can use inline arrow functions.

let names = animals
  .filter(a => a.species == 'cat')
  .filter(a => a.age >= 2)
  .map(a => a.name)
  .sort();

console.log(names); // [ "Kitty", "Snowball" ]

Logically, we could even combine the first two filters:

let names = animals
  .filter(a => a.species == 'cat' && a.age >= 2)
  .map(a => a.name)
  .sort();

console.log(names); // [ "Kitty", "Snowball" ]

Nice! We can use this style to compose any number of steps needed to filter, map, or otherwise iterate over collections of data in a way that is immediately clear and easy to follow when looking at the code. Using arrow functions in this way has a few advantages over using named function declarations:

The function logic is expressed roughly where it is being executed in the program flow, aiding in legibility and developer comprehension.
As arrow functions are anonymous, they do not unnecessarily pollute (expose themselves to) the surrounding lexical scope.

For non-trivial element structure complexity and manipulation needs, there will usually be many different ways to compose these kinds of data manipulation methods. There’s no right or wrong way, though it is important to keep both performance and readability in mind. With modern JavaScript engine optimizations, it is unlikely that small micro-optimization tweaks on your part will have much, if any, effect.

Normal Objects as Collections


The for..in loop iterates over the names of all enumerable properties (whose keys are not symbols), including inherited properties.

let obj = { a: true, b: 123, c: 'abc' };

for (let key in obj) {
  console.log(key);
}

// Logs:
// "a"
// "b"
// "c"

The static method Object.entries accepts an object and returns an array of entries, where each entry is a two-element array containing the property’s name and value, respectively.

let obj = { a: true, b: 123, c: 'abc' };

for (let [key, value] of Object.entries(obj)) {
  console.log(`${key} = ${value}`);
}

// Logs:
// "a = true"
// "b = 123"
// "c = abc"
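The two approaches differ in how they treat inherited properties: for..in includes them, while Object.entries returns only the object's own enumerable properties. A sketch, with base and child as hypothetical objects created for the comparison:

```javascript
const base = { inherited: 'from prototype' };
const child = Object.create(base); // child's prototype is base
child.own = 'on the object itself';

const forInKeys = [];
for (const key in child) {
  forInKeys.push(key);
}
console.log(forInKeys); // [ "own", "inherited" ]

const entryKeys = Object.entries(child).map(([key]) => key);
console.log(entryKeys); // [ "own" ]

// To skip inherited properties inside for..in, check ownership explicitly.
const ownOnly = [];
for (const key in child) {
  if (Object.prototype.hasOwnProperty.call(child, key)) {
    ownOnly.push(key);
  }
}
console.log(ownOnly); // [ "own" ]
```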

¹ developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array