A Common-Sense Guide to Data Structures and Algorithms, by Jay Wengrow
Note
This is one of the best beginner-level books on the topic that I’ve encountered; in particular, the sections on Big O and algorithmic performance come to mind. I don’t know that I’d recommend rereads of the book, but I would most certainly recommend that anyone who is new to the topic read it end to end, with a pen in tow to make notes right in the book.
It’s that good.
Chapter 1: Why Data Structures Matter
The Array
An array is a contiguous dataset. We might write, in Python:
some_array = [10, 21, 2, 0, 8, 72, 8]
print(some_array[3])

Internally, the machine stores the length of the array, and the position of the first item.
Then, the computer can access the nth item by offsetting from the first. So to get the first item, you offset by 0; to get the second, by 1.
This is why you get the 0-based indexing everywhere. Makes a lot of sense now that I think about it. I wonder how Visual Basic manages the 1-based index.
So for the computer, accessing an item in a list is a single step. If you ask for the nth item, all the program needs to do is offset the memory position by n and return the value.
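As a rough sketch of that offset arithmetic (the base address and item size here are invented for illustration; the interpreter handles all of this for you):

```python
# Illustrative model of array indexing: address = base + index * item_size.
# BASE_ADDRESS and ITEM_SIZE are made-up numbers for the sketch.
BASE_ADDRESS = 1000   # where the array starts in memory
ITEM_SIZE = 8         # bytes per item


def address_of(index):
    """Computing the nth item's address is one multiplication and one
    addition, regardless of the array's length: a single step."""
    return BASE_ADDRESS + index * ITEM_SIZE


print(address_of(0))  # 1000: the first item sits at the base address
print(address_of(3))  # 1024: three item-widths past the base
```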
Note
If n exceeds the array’s length, you have an out-of-bounds access, not a stack overflow: the program is trying to read memory that is not its own (C might happily hand you a neighbor’s bytes; Python raises an IndexError). It’s like trying to overdraw from your bank vault and reaching into the vault next to yours.
In a data structure, you perform the following operations:
Read
Search
Insert
Delete
Measuring the speed of these operations is where the way a data structure is organized comes in. This is why some data structures are suitable for certain usages, while others are not.
Note
This book’s section on data structures and why array indexing is O(1) is quite good. Do give it a reread.
For an array of size N:
Getting item at index n: 1 step
Setting item at index n: 1 step
Appending 1 item to the end: 1 step
Prepending 1 item to the beginning: N+1 steps
Deleting an item in the list: N steps (1 step is the best case scenario, when you delete the last item)
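To see where the N+1 comes from, here is a hedged sketch that performs the shifts by hand instead of calling list.insert (which would hide them):

```python
def prepend(arr, value):
    """Prepend by shifting every existing item one cell to the right
    (N steps), then writing the new value into cell 0 (1 more step)."""
    arr.append(None)                      # grow the array by one cell
    for i in range(len(arr) - 1, 0, -1):  # shift all N items right
        arr[i] = arr[i - 1]
    arr[0] = value                        # the +1 step
    return arr


print(prepend([10, 21, 2], 5))  # [5, 10, 21, 2]
```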
Tip
For algorithmic speed measurement, always be a pessimist. Consider only the worst case scenario.
The Set
A set is an array of unique values. There are no duplicates.
Note
This is a generalization, but we’ll get to that in time.
Most set operations are identical to arrays. However, insertion is where sets and arrays differ.
Appending or prepending an item into a set of N items will take N steps. That’s because a set needs to fulfil its prime objective: no duplicates. To insert an item i into a set of N items, you first need to search the entire set for an occurrence of i, and only when it doesn’t exist do you insert it into the set.
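A minimal sketch of that insert-with-search, assuming the set is backed by a plain Python list:

```python
def set_insert(items, value):
    """Insert into an array-backed set: scan all N items first,
    then append only if the value was not found."""
    for item in items:           # the full search is unavoidable
        if item == value:
            return items         # duplicate: leave the set unchanged
    items.append(value)          # safe to insert now
    return items


print(set_insert([10, 21, 2], 21))  # [10, 21, 2]  (already present)
print(set_insert([10, 21, 2], 8))   # [10, 21, 2, 8]
```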
Chapter 2: Why Algorithms Matter
While data structures have their own performance attributes, so do algorithms in general.
Ordered Arrays
An ordered array is almost identical to the classic array, but with one caveat: in an ordered array, the items are kept in sorted order.
This has its benefits, but let’s look at its performance.
Read: 1 step
Insertion: N + 2 steps
Deletion: N steps
Insertion takes N + 2 steps because the item needs to be inserted into the right location; you cannot just drop it at the very end.
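A sketch of ordered insertion, assuming a plain Python list (list.insert does the rightward shifting internally):

```python
def ordered_insert(arr, value):
    """Walk forward until we pass the insertion point (comparisons),
    then shift the remaining items right and write the value. Between
    the comparisons and the shifts, every cell is touched roughly once,
    plus the final write: about N + 2 steps in the worst case."""
    index = 0
    while index < len(arr) and arr[index] < value:
        index += 1                  # one comparison per item walked past
    arr.insert(index, value)        # Python shifts the rest for us
    return arr


print(ordered_insert([2, 8, 10, 72], 21))  # [2, 8, 10, 21, 72]
```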
Searching Ordered Arrays
While searching normal arrays is a matter of looking at each and every item, this is not necessarily so with an ordered array. Instead, we can use Binary Search.
Note
All source is going to be placed in their respective topic folder.
Binary Search halves the remaining search space each time it compares a value to the search value, so there will be \(log_2N\) steps involved. Each comparison narrows the candidates down to one of two halves, hence the name.
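A quick sketch of Binary Search over an ordered array (my own paraphrase, just to pin the idea down):

```python
def binary_search(arr, target):
    """Repeatedly halve the search window; each comparison discards half
    of the remaining candidates, hence at most log2(N) comparisons."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1        # target can only be in the upper half
        else:
            high = mid - 1       # target can only be in the lower half
    return -1                    # not present


print(binary_search([2, 8, 10, 21, 72], 21))  # 3
print(binary_search([2, 8, 10, 21, 72], 5))   # -1
```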
Tip
To understand what \(log_2N\) means, or what \(log_MN\) means, read The Algorithm Design Manual’s chapter on Logarithms.
Q: For an ordered array with 100 elements, binary search takes 7 steps. How many steps does it take for 200 elements?
A: 8 steps
Tip
Every time the number of elements doubles, the steps involved in Binary Search increases by 1.
Tip
The Logarithm of N to the base M shows how many times N can be divided by M.
\(log_2N\) shows how many times N can be halved.
\(log_{10}N\) shows how many times N can be divided by 10.
Chapter 3: O Yes! Big O Notation
Note
This book ignores most of the mathematical basis for Big O, and explains stuff rather simply. I recommend this section a lot. However, for a mathematical explanation, read The Algorithm Design Manual’s sections on Big O.
Big O measures how many steps an algorithm will take in relation to its input size.
If there are N data elements, how many steps will the algorithm take?
\(O(N)\) says that the answer to the key question is that the algorithm will take N steps.
Warning
If you have preconceived notions about Big O, please leave them at the door, for now.
Consider the following:
def do_something(arr):
    length = len(arr)
    mid = arr[length // 2]
    print(f"{length=}, {mid=}")

How many steps does this code example take?
The answer is 3.
What is Big O notation?
\(O(1)\).
The soul of Big O is what Big O is truly concerned about: how will an algorithm’s performance change as the data increases?
Big O doesn’t care how many steps the algorithm actually takes. It instead cares about the rate of change of an algorithm given change in the input data.
\(O(1)\) and \(O(3)\) mean the same to Big O. This algorithm has no relationship with the input data. It will not get slower or faster given more or less data.
\(O(N)\) is a different matter, however. It means that the algorithm has a linear relationship with the data.
def do_something(arr):
    length = len(arr)
    mid = arr[length // 2]
    print(f"{length=}, {mid=}")
    for element in arr:
        print(f"{element=}")
        print(f"{2*element=}")

Here, the algorithm seems to take \(O(2N+3)\) steps. \(2N+3\) is a linear polynomial. Wait, I promised no mathematics. It means that it is a statement that can be conceived as a straight line on a graph.
To Big O, the number next to \(N\) does not matter. Neither does the value 3 that is at the end.
In Big O terms, the above program has a runtime of \(O(N)\).
Big O is also a worst-case-scenario measure. For an algorithm, always consider the worst case scenario when denoting it in Big O terms. Linear search might take 1 step if your first item is the item you were looking for, but in an array of size N, if it’s the last item you’re looking for, you need N steps. Big O in this scenario is \(O(N)\), not \(O(1)\).
Big O concerns itself with Logarithms quite a lot. Logarithms are the opposite of exponents.
\(2^3\) is 8. Then, \(log_28\) is 3.
\(10^2\) is 100. Then, \(log_{10}100\) is 2.
An alternative way of thinking about logarithms is: how many times do we need to divide a number by the base (halves for base 2, tenths for base 10) to get down to 1?
\(O(log_2N)\), for brevity: \(O(logN)\), measures how many times we need to divide a dataset into halves to arrive at our result in the worst case scenario.
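The halving reading of \(log_2N\) can be checked directly; this little sketch just counts the divisions:

```python
import math


def halvings(n):
    """Count how many times n can be halved before it reaches 1 --
    a hands-on reading of log2(n) for powers of two."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


print(halvings(8))      # 3, matching log2(8)
print(halvings(1024))   # 10
print(math.log2(1024))  # 10.0
```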
Chapter 4: Speeding Up Your Code with Big O
Note
Read up on the implementation of Bubble Sort.
Bubble Sort has two kinds of steps: comparisons and swaps. Note how many times it loops through an array.
No, not twice. For each element, it loops through the entire array. In fact, it doesn’t know the array is sorted until it completes a full pass without making a swap.
Welcome to \(O(N^2)\). You do not want to be here.
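For reference, a sketch of Bubble Sort (my own paraphrase, not the book’s listing):

```python
def bubble_sort(arr):
    """Compare adjacent pairs and swap when out of order; after each
    full pass the largest unsorted item has 'bubbled' to its final
    spot. Only a pass with zero swaps proves the array is sorted."""
    unsorted_until = len(arr) - 1
    is_sorted = False
    while not is_sorted:
        is_sorted = True
        for i in range(unsorted_until):        # comparisons
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]   # swap
                is_sorted = False              # another pass is needed
        unsorted_until -= 1
    return arr


print(bubble_sort([72, 8, 10, 2, 21]))  # [2, 8, 10, 21, 72]
```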
Any polynomial with a power of 2 is a quadratic polynomial. On a graph, it looks like a parabola, but not the downward arc of a thrown ball; this one faces upward. That means if you have 1 million data points, your algorithm will need on the order of \(1000000^2\) steps. A million squared.
Any algorithm wherein you loop through the entire list for each item, or even portions of the list but with a worst-case scenario where you loop through the entire list, is of this order. If you can avoid it, do not implement one of these.
\(O(N^3)\) exists. Just pray that you do not meet him or his cousins.
Chapter 5: Optimizing Code With and Without Big O
Note
Read up on the implementation of Selection Sort.
In Selection Sort, you loop through the array to find the smallest item and swap it with the first item. This first item is, in fact, a cursor which moves ahead and marks the sorted part of the array.
For each item in the array, Selection Sort looks ahead at the rest of the items in the array, comparing them and seeing if there’s something smaller than this item. It then selects the smallest item, if it is not the current item, and swaps it with the current item.
This way, after the first passthrough, the smallest item is right at the front. The algorithm ensures that at the Nth passthrough, the first N items are in order at the front of the array.
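A sketch of Selection Sort as described above (again a paraphrase, not the book’s listing):

```python
def selection_sort(arr):
    """For each cursor position, scan the rest of the array for the
    smallest item and swap it into place; after the Nth pass the
    first N items are sorted."""
    for cursor in range(len(arr) - 1):
        smallest = cursor
        for i in range(cursor + 1, len(arr)):   # look ahead at the rest
            if arr[i] < arr[smallest]:
                smallest = i
        if smallest != cursor:                  # swap only when needed
            arr[cursor], arr[smallest] = arr[smallest], arr[cursor]
    return arr


print(selection_sort([72, 8, 10, 2, 21]))  # [2, 8, 10, 21, 72]
```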
So let’s look at Selection Sort and Bubble Sort together.
Selection Sort has to look at the rest of the items for each item in the list.
Bubble Sort has to look at the entire list for each item in the list.
While this seems straightforward, and no one will fault you for saying Selection Sort is faster (roughly half the runtime) than Bubble Sort, this does not matter to Big O.
And here’s the funny thing in the world of Big O Notation: Selection Sort and Bubble Sort are described in exactly the same way.
Tip
Remember to leave your knowledge or preconceived notions of Big O at the door. All you need to do is check this:
Does the algorithm have nested loops?
The exact time does not matter.
Why?
Big O Notation ignores constants.
It is easy to say Selection Sort is an \(O(N^2/2)\) algorithm. It would seem common sense.
However, consider that the array could grow astronomically. As N grows arbitrarily large, dividing by 2 changes nothing about how the step count grows.
\(O(N^2/2)\) is just \(O(N^2)\).
\(O(N/2+50000)\) is just \(O(N)\).
\(O(100000)\) is just \(O(1)\).
Big O Categories
Big O ignores the constants because it uses categories.
Big O Notation only concerns itself with general categories of algorithm speeds.
If we were to compare two buildings, one a single-family home and one a skyscraper, it becomes almost moot to mention how many floors each one has, because the two buildings are so incredibly different in their sizes and functions.
This is the idea behind Big O. It categorizes algorithms into buckets wherein the bucket itself is a definitive measure of how quickly the algorithm grows with input data.
It won’t matter if your \(O(1)\) algorithm takes 10 minutes to run or if your \(O(N^2)\) takes 1 \({\mu}s\), at least not to Big O, because it doesn’t care how long each step takes, only how many steps there will be in relation to larger data set sizes.
Talking about \(O(2N)\) when compared to \(O(N^2)\) is like talking about a two-story house compared to a skyscraper.
Chapter 6: Optimizing for Optimistic Scenarios
[T]he worst-case scenario isn’t the only scenario worth considering.
Insertion Sort is a sorting algorithm that reveals the power of analyzing scenarios beyond the worst case.
Insertion Sort involves: removals, comparisons, shifts, and insertions.
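A sketch of Insertion Sort with those four operations called out in comments (a paraphrase, not the book’s listing):

```python
def insertion_sort(arr):
    """Pull each item out, shift larger sorted items rightward to open
    a gap, then drop the item into the gap."""
    for index in range(1, len(arr)):
        temp = arr[index]                # removal
        position = index - 1
        while position >= 0 and arr[position] > temp:   # comparisons
            arr[position + 1] = arr[position]           # shifts
            position -= 1
        arr[position + 1] = temp         # insertion
    return arr


print(insertion_sort([72, 8, 10, 2, 21]))  # [2, 8, 10, 21, 72]
```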
Big O Notation only takes into account the highest order of N when we have multiple orders added together.
This means that:
\(O(N^2 + 2N)\) = \(O(N^2)\)
\(O(N^4 + N^3 + N^2)\) = \(O(N^4)\)
This is because as \(N\) increases, the higher-order values become so much larger than the lower-order ones that the lower-order terms become trivial enough to ignore.
Note
Take a look at this chart to understand the severity with which these categories vary.
In the worst-case scenario, Selection Sort is faster than Insertion Sort. However, it is not the case when we look at the average case.
Best, Average and Worst cases follow a Bell Curve, showing that the worst case is, just like the best case, an outlier.
                  Best Case        Average Case     Worst Case
Selection Sort    \(O(N^2/2)\)     \(O(N^2/2)\)     \(O(N^2/2)\)
Insertion Sort    \(O(N)\)         \(O(N^2/2)\)     \(O(N^2)\)
Being able to discern between best, average and worst-case scenarios is a key skill in choosing the best algorithm for your needs, as well as taking existing algorithms and optimizing them further to make them significantly faster. [W]hile it’s good to be prepared for the worst case, average cases are what happen most of the time.
Chapter 7: Big O in Everyday Code
Note
I don’t have much to say about this chapter beyond that I really appreciate it being there. It nails the Big O Notation mindset into you, and so far, I’ve enjoyed it.
The problems’ solutions are fairly easy, and it’s a chapter you should read if you haven’t grokked Big O yet.
Tip
If an algorithm has 2 inputs of length \(N\) and \(M\) and it has a nested loop over both of them, then what would have been \(O(N \times M)\) is expressed as \(O(N^2)\), because in the worst-case scenario both inputs are the same size.
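A tiny example of the two-input shape the tip describes (common_items is a made-up name for illustration):

```python
def common_items(arr_1, arr_2):
    """Nested loops over two different inputs: N * M comparisons in
    total. When both inputs can grow to the same size, the worst case
    is N squared."""
    shared = []
    for item_1 in arr_1:          # N iterations
        for item_2 in arr_2:      # M iterations for each of the N
            if item_1 == item_2:
                shared.append(item_1)
    return shared


print(common_items([1, 2, 3], [2, 3, 4]))  # [2, 3]
```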
Chapter 8: Blazing Fast Lookup with Hash Tables
A Hash Table is a data structure in which keys are hashed into indices that correlate with values in an array-like structure.
The example this book uses is alphabet-based hashing, using the product of the numeric value of each letter.
“dog” becomes \(4*15*7 = 420\), and the corresponding value associated with “dog” is stored in the 420th cell of the array that holds the Hash Table’s values.
This lookup mechanism is usually designed with the following assumptions:
The hashing of a key always yields the same value. This way, there can be a 1:1 relationship between keys and their values.
The hashing is non-random and doesn’t depend on execution time or machine. This enables the data structure to be written to disk, if need be.
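A toy version of that alphabet-product hash; this is only the book’s teaching device, not a usable hash function:

```python
def toy_hash(key):
    """Multiply the alphabet positions of the letters together, as in
    the book's "dog" example: the same key always yields the same value,
    and the result depends on nothing but the key itself."""
    product = 1
    for letter in key.lower():
        product *= ord(letter) - ord("a") + 1   # a=1, b=2, ..., z=26
    return product


print(toy_hash("dog"))  # 4 * 15 * 7 = 420
```

Note that "god" hashes to the same cell as "dog": collisions like this are exactly why the choice of hash function matters for efficiency.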
The efficiency of a hash table depends on three factors:
How much data we’re storing in the hash table.
How many cells are available in the hash table.
Which hash function we’re using.
A good hash function is one that distributes its data across all available cells. The more we can spread our data, the fewer collisions we will have.
A good hash table strikes a balance of avoiding collisions while not consuming lots of memory. To accomplish this, computer scientists have developed the following rule of thumb: for every 7 data elements stored in the hash table, it should have 10 cells.
Hash tables have a lookup efficiency of \(O(1)\).
Hash tables are used in places where you’d usually use long if-elif-else chains. Or, you’d use them when you want to store information about a single entity together. This is most visible in the JSON data format.
"""Algorithm to detect whether 2 words are an anagram or not"""


def is_anagram(word_1: str, word_2: str) -> bool:
    """Checks whether 2 words are anagrams of each other"""
    word_1_dict = {}
    for letter in word_1:
        word_1_dict[letter] = word_1_dict.get(letter, 0) + 1
    word_2_dict = {}
    for letter in word_2:
        word_2_dict[letter] = word_2_dict.get(letter, 0) + 1
    return word_1_dict == word_2_dict


def test_anagram():
    """Checks the anagram function"""
    assert is_anagram("mood", "doom")
    assert is_anagram("rot", "tor")
    assert is_anagram("something", "nothing") is False

Using a dictionary (Python’s implementation of a Hash Table) here gives us an algorithm of \(O(N)\) order.
Exercises
"""Algorithm to determine the intersection of two arrays"""
from collections import OrderedDict


def intersect_arrays(arr_1: list, arr_2: list) -> list:
    """Returns the intersection of two arrays"""
    arr_1_hash = OrderedDict()
    for item in arr_1:
        arr_1_hash[item] = arr_1_hash.get(item, 0) + 1
    arr_2_hash = OrderedDict()
    for item in arr_2:
        arr_2_hash[item] = arr_2_hash.get(item, 0) + 1
    intersection = []
    for key in arr_1_hash:
        count_1 = arr_1_hash[key]
        count_2 = arr_2_hash.get(key, 0)
        # include items that occur in both arr_1 and arr_2
        occurs_in_both = min(count_1, count_2)
        intersection.extend([key] * occurs_in_both)
    return intersection


def test_intersect_arrays():
    """Tests that the above algorithm works"""
    assert intersect_arrays([1, 2, 3], [2]) == [2]
    assert intersect_arrays([1, 1, 2, 3], [1, 2, 3]) == [1, 2, 3]
    assert intersect_arrays([2, 1, 3], [1, 2, 2, 3]) == [2, 1, 3]

Todo
Do the other exercises.
Todo
Write an implementation of Hash Tables in Rust and Python.
Chapter 9: Crafting Elegant Code with Stacks and Queues
Stacks
A stack is almost an array, except:
Data can be inserted only at the end of a stack.
Data can be deleted only from the end of a stack.
Only the last element of a stack can be read.
The end of a stack is referred to as its top, and the beginning of the stack is referred to as its bottom.
Tip
In Python, a plain list can be used as a stack, since lists support append and pop methods. However, it is better to write a wrapper around a regular list, since that way you can put checks in place so that users cannot insert an item wherever they want.
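A minimal sketch of such a wrapper (the class and method names are my own):

```python
class Stack:
    """A thin wrapper around a list that exposes only stack operations,
    so callers cannot insert or read anywhere but the top."""

    def __init__(self):
        self._data = []

    def push(self, value):
        self._data.append(value)   # insert only at the end (the top)

    def pop(self):
        return self._data.pop()    # delete only from the end

    def peek(self):
        return self._data[-1]      # read only the last element


stack = Stack()
stack.push(1)
stack.push(2)
print(stack.peek())  # 2
print(stack.pop())   # 2
print(stack.peek())  # 1
```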
Stacks are useful when dealing with the order of things, especially when we care about what was the last item to be put into an array. Looking one step behind is a constant pattern when dealing with stacks.
Chapter 10: Recursively Recurse with Recursion
Recursion has a lot to do with the stack. When a function calls itself, the calls pile up on what is known as the Call Stack. The output of one call depends on the output of the next, and the stack eventually unwinds when the innermost call returns something that does not depend on another call.
Consider the factorial function:
"""Program to find the factorial of a number"""
import math


def factorial(n: int) -> int:
    """Function to find the factorial of a number"""
    if n <= 1:
        return 1
    else:
        return n * factorial(n - 1)


def test_factorial():
    for number in range(500):
        assert factorial(number) == math.factorial(
            number), "Could not calculate the factorial of {}".format(number)

Here, the function factorial will not immediately return anything.
factorial(5) cannot return anything before it invokes factorial(4).
factorial(4) cannot return anything before it invokes factorial(3).
factorial(3) cannot return anything before it invokes factorial(2).
factorial(2) cannot return anything before it invokes factorial(1).
factorial(1) simply returns 1.
Notice that until the program reaches factorial(1), it must hold every one of these function calls in memory. If the chain grows too deep, this creates an untenable situation: the machine can no longer keep track of the stack.
As each function call resolves, the call stack is popped in LIFO order, and the final result is returned to the user.
If the recursion call stack exceeds the memory allotted to it, there is a Stack Overflow, and the program shuts down.
Exercises
"""For an array containing both numbers and other arrays, write a recursive function that prints out all the numbers, and just the numbers"""


def print_nums(arr):
    for item in arr:
        if isinstance(item, int):
            print(item)
        else:
            print_nums(item)

Chapter 11: Learning to Write in Recursive
Note
This chapter exists to drill in how you write recursively. Do all the exercises.
"""Recursive algorithm to find all directories within a given directory"""
import os


def get_directories(path=None):
    """Returns a list of directories and subdirectories"""
    if path is None:
        path = os.getcwd()
    contents = os.listdir(path)
    directories = []
    for item in contents:
        abs_path = os.path.join(path, item)
        if os.path.isdir(abs_path):
            directories.append(abs_path)
            directories.extend(get_directories(abs_path))
    return directories

"""Recursive function to double every item in an array"""
import random


def double_array(array, index=0):
    """Doubles every item in an array"""
    if index >= len(array):   # also covers an empty array
        return
    array[index] *= 2
    double_array(array, index + 1)


def test_double_array():
    input_array = [random.random() for _ in range(random.randint(0, 1000))]
    doubled_array = [x * 2 for x in input_array]
    double_array(input_array)
    assert input_array == doubled_array, "Unable to double array"

"""Recursive function to get the nth factorial"""
import math
import random


def factorial(n):
    """Function to get the nth factorial"""
    if n <= 1:
        return 1
    else:
        return n * factorial(n - 1)


def test_factorial():
    # stay well below Python's default recursion limit of 1000 frames
    n = random.randint(1, 500)
    assert factorial(n) == math.factorial(
        n), "unable to calculate the nth factorial"

"""Recursive solution to array sum"""
import random


def array_sum(arr):
    if len(arr) == 1:
        return arr[0]
    else:
        return arr[0] + array_sum(arr[1:])


def test_array_sum():
    # keep the array short of Python's default recursion limit
    input_array = [random.randint(0, 1000)
                   for _ in range(random.randint(1, 500))]
    assert array_sum(input_array) == sum(input_array)

"""Recursive solution to reverse a string"""


def reverse_string(input_string):
    """Reverse string"""
    if len(input_string) > 1:
        return input_string[-1] + reverse_string(input_string[:-1])
    else:
        return input_string


def test_reverse_string():
    sample_strings = [
        "hello world",
        "peter piper picked a pack of pickled peppers",
        "the cake is a lie",
        "once upon a time",
    ]
    for st in sample_strings:
        assert reverse_string(st) == "".join(reversed(st))

"""Recursive algorithm to count x in a string"""


def count_x(input_string):
    """Counts the occurrences of x in a string"""
    if len(input_string) == 1:
        return 1 if input_string[0] == "x" else 0
    elif len(input_string) > 1:
        return (1 if input_string[0] == "x" else 0) + count_x(input_string[1:])
    else:
        return 0


def test_count_x():
    """tests that the counting function works"""
    input_strings = [
        "x",
        "",
        "lmzxcvuljwsdf;laxlvjsdflweo asdflkxc;lvslweropupas xlx;lkvj;ljweorijxlj ;xlxx;lkj;lxx ",
        "something that doesn't have it",
    ]
    for string in input_strings:
        assert count_x(string) == string.count("x")

"""Let's say we have a staircase of N steps, and a person has the ability to
climb one, two, or three steps at a time. How many different possible "paths"
can someone take to reach the top? Write a function that will calculate this
for N steps.
"""


def number_of_paths(n):
    """Solves the staircase problem"""
    if n < 0:
        return 0
    elif n in (0, 1):
        return 1
    else:
        return (
            number_of_paths(n - 1)
            + number_of_paths(n - 2)
            + number_of_paths(n - 3)
        )

"""Generate all anagrams of a set of letters"""


def anagrams_of(string: str):
    """Returns a list of anagrams of a string"""
    if len(string) == 1:
        # if the string is a single character, return a list containing
        # that single character.
        return [string[0]]
    # instantiate an empty placeholder
    collection = []
    # get all the anagrams of the rest of the string, besides the first
    # character, until the end of the string.
    substring_anagrams = anagrams_of(string[1:])
    for substring_anagram in substring_anagrams:
        # while looping through this new list of sub-anagrams,
        for index in range(len(substring_anagram) + 1):
            # insert the first character into each position of the
            # sub-anagram in turn
            anagram = (substring_anagram[:index]
                       + string[0]
                       + substring_anagram[index:])
            # append to our collection
            collection.append(anagram)
    # return the result
    return collection


def test_anagrams_of():
    """tests anagrams_of"""
    input_string = "abc"
    result = ["abc", "bac", "cab", "cba", "acb", "bca"]
    anagrams = anagrams_of(input_string)
    for item in result:
        assert item in anagrams, "{} not found in anagrams of {}".format(
            item, input_string)
