Data Collections & Iteration
For most kinds of applications, a great deal of the main program logic is concerned with managing and manipulating collections of data. As we will see in this section, JavaScript provides some very handy ways to work with collections.
Built-In Collection Objects
Array
The most commonly used type of collection is the Array. In fact, JavaScript
provides a unique shorthand literal syntax just for creating them.
let myArr = [42, 'cool', { a: 3 }, ['inner array!']];
myArr.length; // 4
myArr[0]; // 42
myArr[myArr[2].a]; // ["inner array!"] <-- myArr[2].a is 3, so this is myArr[3]
typeof myArr; // "object"
Array.isArray(myArr); // true
Array.prototype.isPrototypeOf(myArr); // true
Arrays are simply objects. However, they inherit a great deal of functionality
from the global built-in Array.prototype.
let myArr = [3, 5, 7, 9, 12];
myArr.pop(); // 12 // now [3, 5, 7, 9]
myArr.push(11); // 5 <-- (new array length) // now [3, 5, 7, 9, 11]
myArr.shift(); // 3 // now [5, 7, 9, 11]
There are many more simple array manipulation methods – for a comprehensive
list, see MDN’s reference for Array.¹ We will explore some of the more
interesting and useful methods later in this chapter.
Set
A Set is a collection of unique values.
let mySet = new Set();
mySet.add(42);
mySet.add('cool');
mySet.add(42); // <-- duplicate, silently ignored
mySet.has(42); // true
mySet.size; // 2
mySet.delete('cool');
mySet.size; // 1
A nice characteristic of the built-in iterable object types is that they can be
easily converted between one another. This is especially useful if we need to
de-duplicate the entries of an Array.
let myArr = [3, 5, 7, 9, 5, 3];
let mySet = new Set(myArr); // <-- from existing iterable
let dedupedArr = Array.from(mySet);
console.log(dedupedArr); // [ 3, 5, 7, 9 ]
Map
The third collection object type is Map, which is a collection of key-value
pairs. As you may realize, we already have a map-like mechanism: regular ol’
objects. Why would we need Map objects, then? Most of the time we don’t, but
they do have a few more capabilities than regular objects that make them suited
for certain use cases.
Maps are iterable, so they can be used in many places interchangeably with
Arrays or Sets. Maps also use instance methods to manipulate and search
over them, such as set, get, and delete. Whereas regular object property
names are strictly string or symbol primitives, keys in Map entries can be
values of any type, including objects.
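As a minimal sketch of the Map methods just described, the following uses objects as keys. The variable names (leo, polly, visits) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical example data; these names are illustrative only.
const leo = { name: 'Leo' };
const polly = { name: 'Polly' };

const visits = new Map();
visits.set(leo, 3); // an object used as a key
visits.set(polly, 1);
visits.set('total', 4); // keys of different types can coexist

console.log(visits.get(leo)); // 3
console.log(visits.has({ name: 'Leo' })); // false <-- a different object
console.log(visits.size); // 3

visits.delete('total');
console.log(visits.size); // 2
```

Note that object keys are compared by identity, not by structure, which is why the look-alike object literal is not found.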
Iterables
Very rarely in JavaScript will you ever need to write out a for-loop in its
long, imperative form.
let myData = [
/* big data array */
];
for (let i = 0; i < myData.length; i++) {
let element = myData[i]; // do something with the element
}
Often, we don’t really care about keeping track of the index; we just want to
iterate over all of the elements. JavaScript provides a nice syntactic shorthand
called the for..of loop which can be used with any iterable.
let myArr = [42, 'cool', { a: 3 }];
for (let element of myArr) {
console.log(typeof element);
}
// Logs:
// "number"
// "string"
// "object"
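Since for..of works with any iterable, the same loop shape should also apply to Set and Map objects. The sketch below assumes small throwaway collections; a Map yields [key, value] pairs, which can be destructured in the loop header.

```javascript
const uniqueNumbers = new Set([3, 5, 3]);
const seen = [];
for (const value of uniqueNumbers) {
  seen.push(value);
}
console.log(seen); // [ 3, 5 ]

const lookup = new Map([['a', 1], ['b', 2]]);
const pairs = [];
for (const [key, value] of lookup) {
  pairs.push(`${key} = ${value}`);
}
console.log(pairs); // [ "a = 1", "b = 2" ]
```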
Functional Iteration
Arrays provide mechanisms for iteration which behave well in a functional
style of programming. The .forEach() method executes a given callback function
once per array element. Let’s look at the method’s signature:
arr.forEach(callback[, thisArg])
The callback function we provide is passed three arguments, any of which it may
ignore: the current element value, the current index, and a reference to the
array itself. .forEach() optionally takes a second argument, thisArg, which
will be bound as the value of this during execution of the callback function.
let myArr = [42, 'cool', { a: 3 }];
let myFunc = function(element, index) {
console.log(element, 'is at index', index);
};
myArr.forEach(myFunc);
// Logs:
// 42 "is at index" 0
// "cool" "is at index" 1
// { a: 3 } "is at index" 2
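The thisArg parameter is easiest to see with a callback that relies on this. The following is a small sketch; the counter object and its tally method are hypothetical names, not part of any API.

```javascript
// Hypothetical helper object, used only to illustrate thisArg.
const counter = {
  count: 0,
  tally: function(element) {
    // `this` is the value passed as thisArg to forEach
    if (typeof element === 'number') {
      this.count++;
    }
  },
};

[42, 'cool', 7].forEach(counter.tally, counter);
console.log(counter.count); // 2
```

Arrow functions do not have their own this, so thisArg has no effect when the callback is an arrow function.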
Unlike for..of loops, there is no way to break out of forEach iteration. If
you need the ability to break, you can always use one of the loop statements,
though Arrays also provide convenient functional-style methods that remove the
need for breaks, such as when searching for a value. Most Array iteration
methods accept a callback that is passed the same three arguments we just
described for the .forEach() method.
.some() lets us determine whether any elements in an array pass a given test.
Similarly, .every() lets us determine whether all elements in an array pass a
given test. .find() can use the same test to return the value of the first
element which passes the test. The test in these cases is simply the callback
function we provide, which should return a truthy value for elements that “pass”
and a falsy value for those that “fail”.
let myArr = [42, 'cool', { a: 3 }];
let isString = function(element) {
if (typeof element === 'string') {
return true;
} else {
return false;
}
};
myArr.some(isString); // true
myArr.every(isString); // false
myArr.find(isString); // "cool"
Internally, these methods immediately return (stop iteration) the first time
they find an element which passes the test function (or fails, for .every()),
for the same performance reason we would have used break inside of a regular
for loop.
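One way to observe this early exit is to count how many times the callback runs. The sketch below, with illustrative variable names, compares .some() against an equivalent for..of loop that uses break.

```javascript
const values = [42, 'cool', { a: 3 }];

let calls = 0;
const found = values.some(function(element) {
  calls++; // count how many elements were actually tested
  return typeof element === 'string';
});
console.log(found, calls); // true 2 <-- the third element is never visited

// The equivalent search written as a loop with an explicit break
let firstString;
for (const element of values) {
  if (typeof element === 'string') {
    firstString = element;
    break;
  }
}
console.log(firstString); // "cool"
```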
When working with very large collections of data, three methods are of great
use: .filter(), .map(), and .reduce(). To explore these and see how they
work together, let’s bring back our scenario of writing an application to help
manage a pet shop. For the remainder of the code examples in this section, we’re
going to use a dataset that looks like this:
const animals = [
{ name: 'Leo', species: 'dog', age: 8 },
{ name: 'Snowball', species: 'cat', age: 6 },
{ name: 'Polly', species: 'bird', age: 2 },
{ name: 'Goldey', species: 'fish', age: 1 },
{ name: 'Fido', species: 'dog', age: 3 },
{ name: 'Kitty', species: 'cat', age: 4 },
{ name: 'Squawks', species: 'bird', age: 12 },
{ name: 'Bubbles', species: 'fish', age: 2 }, // plus any more animals you can imagine
];
Let’s say that our user only wants to look at dogs for now. We can use the
.filter() method to create a new array which contains only elements which pass
a given test.
function isDog(animal) {
if (animal.species == 'dog') {
return true;
} else {
return false;
}
}
let dogs = animals.filter(isDog);
console.log(dogs);
// [ { name: 'Leo', species: 'dog', age: 8 },
//   { name: 'Fido', species: 'dog', age: 3 } ]
It’s important to know that the new array returned by .filter() contains
shallow copies of the elements from the original array; that is, only the
immediate value of the element is copied. For primitives, this is effectively a
duplicate, but for elements that are objects, only the object reference is
copied. An object element in the filtered array will point to the exact same
object in memory as its respective element in the original array.
dogs[0] === animals[0]; // true
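A practical consequence is that mutating an object through the filtered array is visible through the original array, while changing the filtered array itself is not. The sketch below uses a small stand-in dataset (pets, onlyDogs) rather than the full animals array.

```javascript
// Stand-in data for illustration; not the full dataset from above.
const pets = [
  { name: 'Leo', species: 'dog', age: 8 },
  { name: 'Snowball', species: 'cat', age: 6 },
];

const onlyDogs = pets.filter(function(pet) {
  return pet.species === 'dog';
});

onlyDogs[0].age = 9; // mutates the shared object
console.log(pets[0].age); // 9

onlyDogs.pop(); // changes only the filtered array
console.log(pets.length); // 2
```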
Here’s another task: the user wants a list of all the animals’ names. We can use
the .map() method to map each element in the original array to a transformed
value in a new array. We supply the transformation as a callback function which
takes the now-familiar (element, index, array) arguments.
function justName(element) {
return element.name;
}
let names = animals.map(justName);
console.log(names);
// [ "Leo", "Snowball", "Polly", "Goldey", ... ]
Because .filter() and .map() each return a new array, they are easily
chainable. Imagine that our user just wants the names of all dogs. Keeping the
functions that we just defined above,
let dogNames = animals.filter(isDog).map(justName);
console.log(dogNames); // [ "Leo", "Fido" ]
This is a very common pattern, and is especially useful if we are not interested
in keeping around intermediate results from successive applications of .map()
and .filter().
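The third method named earlier, .reduce(), folds an array down to a single value by passing an accumulator from one callback call to the next. The following is a minimal sketch on a three-animal sample; the helper names addAge and countBySpecies are hypothetical.

```javascript
// A small sample in the same shape as the animals dataset.
const sampleAnimals = [
  { name: 'Leo', species: 'dog', age: 8 },
  { name: 'Snowball', species: 'cat', age: 6 },
  { name: 'Fido', species: 'dog', age: 3 },
];

// The callback receives the accumulator first, then the current element.
function addAge(total, animal) {
  return total + animal.age;
}
const totalAge = sampleAnimals.reduce(addAge, 0); // 0 is the initial value
console.log(totalAge); // 17

// The accumulator can be any value, such as an object of running counts.
function countBySpecies(counts, animal) {
  counts[animal.species] = (counts[animal.species] || 0) + 1;
  return counts;
}
const speciesCounts = sampleAnimals.reduce(countBySpecies, {});
console.log(speciesCounts); // { dog: 2, cat: 1 }
```

Passing the initial value explicitly (0 and {} here) avoids an error when the array is empty.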
Arrays also have a .sort() method. By default, elements are converted to
strings and the array is sorted by ascending UTF-16 code unit values of those
strings.
console.log(names);
// [ "Leo", "Snowball", "Polly", "Goldey",
//   "Fido", "Kitty", "Squawks", "Bubbles" ]
names.sort();
console.log(names);
// [ "Bubbles", "Fido", "Goldey", "Kitty",
//   "Leo", "Polly", "Snowball", "Squawks" ]
Notice that sorting an array does not make a copy; it changes the order of
elements for the array on which it is called. (.sort() will still return a
reference to the array itself, though, so it is chainable along with other array
methods.)
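Because the default order compares string representations, sorting numbers without a comparator can give surprising results. The short sketch below uses an illustrative ages array and also confirms that .sort() returns the very array it was called on.

```javascript
const ages = [8, 6, 2, 1, 3, 4, 12, 2];
const sortedAges = ages.sort(); // no comparator: elements compared as strings

console.log(ages); // [ 1, 12, 2, 2, 3, 4, 6, 8 ] <-- "12" sorts before "2"
console.log(sortedAges === ages); // true <-- same array, sorted in place
```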
The .sort() method takes one optional parameter: a comparator, or
comparison function. The comparator is given two arbitrary array elements as
arguments, element “A” and element “B”. The comparator’s only job is to
determine which of the two elements is “greater” and which is “lesser” (or if
they should be considered the same) for purposes of sorting. The comparator
should return a positive number if A should sort after B, a negative number if
A should sort before B, and 0 if A and B should be considered the same.
Here’s a use case: the user wants a list of all of the animals ordered by age, from youngest to oldest. (We’ll work on a shallow copy of the original array so we don’t modify its order.)
let animalsCopy = Array.from(animals); // shallow copy
function byAge(a, b) {
if (a.age > b.age) return 1;
else if (a.age < b.age) return -1;
else return 0;
}
animalsCopy.sort(byAge);
console.log(animalsCopy);
/* [ { name: "Goldey", species: "fish", age: 1 },
{ name: "Polly", species: "bird", age: 2 },
...
{ name: "Leo", species: "dog", age: 8 },
{ name: "Squawks", species: "bird", age: 12 } ] */
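For numeric fields, a comparator only needs the sign of its result, so subtracting the two values is a common shorthand. The sketch below assumes a small stand-in array; byAgeAscending, byAgeDescending, and getName are hypothetical helper names.

```javascript
// Stand-in data for illustration.
const someAnimals = [
  { name: 'Leo', age: 8 },
  { name: 'Goldey', age: 1 },
  { name: 'Fido', age: 3 },
];

// Negative when a is younger, positive when a is older, 0 when equal.
function byAgeAscending(a, b) {
  return a.age - b.age;
}
// Swapping the operands reverses the order.
function byAgeDescending(a, b) {
  return b.age - a.age;
}
function getName(animal) {
  return animal.name;
}

// Sort shallow copies so the original order is preserved.
const youngestFirst = Array.from(someAnimals).sort(byAgeAscending).map(getName);
const oldestFirst = Array.from(someAnimals).sort(byAgeDescending).map(getName);

console.log(youngestFirst); // [ "Goldey", "Fido", "Leo" ]
console.log(oldestFirst); // [ "Leo", "Fido", "Goldey" ]
console.log(someAnimals[0].name); // "Leo" <-- original untouched
```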
Arrow functions are especially useful when chaining together functional-style data management calls. We can express the program’s functionality in our code more succinctly and legibly.
Say we want to get a list of the names of all cats aged 2 or older, in alphabetical order. Instead of defining a named function to use for each callback, we can use inline arrow functions.
let names = animals
.filter(a => a.species == 'cat')
.filter(a => a.age >= 2)
.map(a => a.name)
.sort();
console.log(names); // [ "Kitty", "Snowball" ]
Logically, we could even combine the first two filters:
let names = animals
.filter(a => a.species == 'cat' && a.age >= 2)
.map(a => a.name)
.sort();
console.log(names); // [ "Kitty", "Snowball" ]
Nice! We can use this style to compose any number of steps needed to filter, map, or otherwise iterate over collections of data in a way that is immediately clear and easy to follow when looking at the code. Using arrow functions in this way has a few advantages over using named function declarations:
- The function logic is expressed roughly where it is being executed in the program flow, aiding in legibility and developer comprehension.
- As arrow functions are anonymous, they do not unnecessarily pollute (expose themselves to) the surrounding lexical scope.
When elements have a complex structure or need non-trivial manipulation, there will usually be many different ways to compose these kinds of data manipulation methods. There’s no right or wrong way, though it is important to keep both performance and readability in mind. With modern JavaScript engine optimizations, it is unlikely that micro-optimizations on your part will have much, if any, effect.
Normal Objects as Collections
The for..in loop iterates over the names of all enumerable properties (whose
keys are not symbols), including inherited properties.
let obj = { a: true, b: 123, c: 'abc' };
for (let key in obj) {
console.log(key);
}
// Logs:
// "a"
// "b"
// "c"
The static method Object.entries accepts an object and returns an array of
its own enumerable entries, where each entry is a two-element array containing
the property’s name and value, respectively.
let obj = { a: true, b: 123, c: 'abc' };
for (let [key, value] of Object.entries(obj)) {
console.log(`${key} = ${value}`);
}
// Logs:
// "a = true"
// "b = 123"
// "c = abc"
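The difference between the two approaches shows up when an object inherits enumerable properties. In the sketch below, the base and child objects are hypothetical: for..in visits the inherited key, while Object.entries reports only the object’s own properties.

```javascript
// Hypothetical objects used only to illustrate inheritance.
const base = { inherited: 'from prototype' };
const child = Object.create(base); // child's prototype is base
child.own = 'mine';

const forInKeys = [];
for (const key in child) {
  forInKeys.push(key);
}
console.log(forInKeys); // [ "own", "inherited" ]

const entryKeys = Object.entries(child).map(([key]) => key);
console.log(entryKeys); // [ "own" ]

// A for..in loop can skip inherited keys with an explicit own-property check.
const ownOnly = [];
for (const key in child) {
  if (Object.prototype.hasOwnProperty.call(child, key)) {
    ownOnly.push(key);
  }
}
console.log(ownOnly); // [ "own" ]
```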