A Common Sense Guide to Data-Structures and Algorithms - Jay Wengrow¶
Note
This is one of the best beginner level books on the topic that I’ve encountered. In particular, the sections on Big O and Algorithmic performance come to mind, I don’t know that I’d recommend rereads of the book, but I would most certainly recommend anyone who is new to the topic to read this book end to end, with a pen in tow to make notes right in the book.
It’s that good.
Warning
This document is work in progress. Jump to current WIP marker
Chapter 1 - Why Data Structures Matter¶
The Array¶
An array is a contiguous dataset. While we might say, in Python:
1 2 | some_array = [10, 21, 2, 0, 8, 72, 8]
print(some_array[3])
|
Internally, the machine stores the length of the array, and the position of the first item.
Then, the computer can access the nth item by offsetting from the first. So to get the first item, you offset 0, to get the second, you offset 1.
This is why you get the 0-based indexing everywhere. Makes a lot of sense now that I think about it. I wonder how Visual Basic manages the 1-based index.
So for the computer, to access an item in a list, it is a single step. If you ask for the nth item, all the program needs to do is offset the memory position by n and return the value.
Note
If the array’s length < n, then you have a stack overflow. The program is trying to access memory that is not its own. It’s like trying to overdraw from your bank vault and reaching into the vault next to yours.
In a data structure, you perform the following operations:
Read
Search
Insert
Delete
When we want to measure the speed of these operations, this is where the entire idea of how a data structure is organized comes in. This is why some data structures are suitable for some usages, while some are not.
Note
This book’s section on data structures and why Array indexing is O(1) is quite good. Do give it a reread again.
For an array of size N:
Getting item at index n: 1 step
Setting item at index n: 1 step
Appending 1 item to the end: 1 step
Prepending 1 item to the beginning: N+1 steps
Deleting an item in the list: N steps (1 step is the best case scenario, when you delete the last item)
Tip
For algorithmic speed measurement, always be a pessimist. Consider only the worst case scenario.
The Set¶
A set is an array of unique values. There are no duplicates.
Note
This is a generalization, but we’ll get to that in time.
Most set operations are identical to arrays. However, insertion is where sets and arrays differ.
Appending or prepending and item into a set of N items will take N steps. That’s because a set
needs to fulfil its prime objective: no duplicates. To insert an item i
into a set N,
you will first need to search the entire set for the occurrence of i
, and only when it
doesn’t exist, do you insert it into the set.
Chapter 2 - Why Algorithms Matter¶
While data-structures have their own performance attributes, so do algorithms in general.
Ordered Arrays¶
An ordered array is almost identical to the classic array*, but the one caveat is that in an ordered array, the items are in order.
This has its benefits, but let’s look at its performance.
Read - 1 step
Insertion - N + 2 steps
Deletion - N steps
Insertion again takes N + 2 steps because the item needs to be inserted into the right location. You cannot just insert an item at the very end.
Searching Ordered Arrays¶
While searching normal arrays is a matter of looking at each and every item, this is not necessarily so with an ordered array. Instead, we can use Binary Search.
Note
All source is going to be placed in their respective topic folder.
Binary Search will halve its steps each time it compares values to the search value. So there will be \(log_2N\) steps involved. It narrows its results down into a binary answer, hence the name.
Tip
To understand what \(log_2N\) means, or what \(log_MN\) means, read The Algorithm Design Manual’s chapter on Logarithms.
Q: For an ordered array with 100 elements, binary search takes 7 steps. How many steps does it take for 200 elements?
A: 8 steps
Tip
Every time the number of elements doubles, the steps involved in Binary Search increases by 1.
Tip
The Logarithm of N to the base M shows how many times N can be divided by M.
\(log_2N\) shows how many times N can be halved.
\(log_{10}N\) shows how many times N can be divided into tenths.
Chapter 3 - O Yes! Big O Notation¶
Note
This book ignores most of the mathematical basis for Big O, and explains stuff rather simply. I recommend this section a lot. However, for a mathematical explanation, read The Algorithm Design Manual’s sections on Big O.
Big O measures how many steps an algorithm will take in relation to its input size.
If there are N data elements, how many steps will the algorithm take?
\(O(N)\) says that the answer to the key question is that the algorithm will take N steps.
Warning
If you have preconceived notions about Big O, please leave them at the door, for now.
Consider the following:
1 2 3 4 | def do_something(arr):
length = len(arr)
mid = arr[length // 2]
print(f"{length=}, {mid=}")
|
How many steps is does this code example take?
The answer is 3.
What is Big O notation?
\(O(1)\).
The soul of Big O is what Big O is truly concerned about: how will analgorithm’s performance change as the data increases?
Big O doesn’t care how many steps the algorithm actually takes. It instead cares about the rate of change of an algorithm given change in the input data.
\(O(1)\) and \(O(3)\) mean the same to Big O. This algorithm has no relationship with the input data. It will not get slower or faster given more or less data.
\(O(N)\) is a different matter, however. It means that the algorithm has a linear relationship with the data.
1 2 3 4 5 6 7 | def do_something(arr):
length = len(arr)
mid = arr[length // 2]
print(f"{length=}, {mid=}")
for element in arr:
print(f"{element=}")
print(f"{2*element=}")
|
Here, the algorithm seems to take \(O(2N+3)\) steps. \(2N+3\) is a linear polynomial. Wait, I promised no mathematics. It means that it is a statement that can be conceived as a straight line on a graph.
To Big O, the number next to \(N\) does not matter. Neither does the value 3 that is at the end.
In Big O terms, the above program has a runtime of \(O(N)\).
Big O is also a worst-case-scenario measure. For an algorithm, always consider the worst case scenario when denoting it in Big O terms. Linear search might take 1 step if your first item is the item you were looking for, but in an array of size N, if it’s the last item you’re looking for, you need N steps. Big O in this scenario is \(O(N)\), not \(O(1)\).
Big O concerns itself with Logarithms quite a lot. Logarithms are the opposite of exponents.
\(2^3\) is 8. Then, \(log_28\) is 3.
\(10*2\) is 100. Then, \(log_{10}100\) is 2.
An alternative way of thinking about logarithms is: how many times do we need to halve (or thirds, or tenths, depending on the :math:`log_NM` base N) a number to get 1?
\(O(log_2N)\), for brevity: \(O(logN)\), measures how many times we need to divide a dataset into halves to arrive at our result in the worst case scenario.
Chapter 4 - Speeding Up Your Code with Big O¶
Note
Read up on the implementation of Bubble Sort.
Bubble Sort has the following steps: comparisons, and swaps. Note how many times it loops through an array.
No, not twice. For each element, it loops through the entire array. In fact, it doesn’t know if an array is sorted until it loops through the array at least once.
Welcome to \(O(N^2)\). You do not want to be here.
Any polynomial with a power of 2 is a quadratic polynomial. On a graph, it would look like a parabola, but not one that ever touches the ground like someone throwing a ball. No, this is an upwards-facing parabola. That means if you have 1 million data points, your algorithm will need to process \(1000000^2\) items. A million squared.
Any algorithm wherein you loop through the entire list, or even portions of the list but perhaps have a worst-case scenario where you loop through the entire list is of this order. If you can avoid it, do not implement one of these.
\(O(N^3)\) exists. Just pray that you do not meet him or his cousins.
Chapter 5 - Optimizing Code With and Without Big O¶
Note
Read up on the implementation of Selection Sort.
In Selection Sort, you loop through the array to find the smallest item and swap it with the first item. This first item is, in fact, a cursor which moves ahead and marks the sorted part of the array.
For each item in the array, Selection Sort looks ahead at the rest of the items in the array, comparing them and seeing if there’s something smaller than this item. It then selects the smallest item, if it is not the current item, and swaps it with the current item.
This way, after the first passthrough, the smallest item is right at the front. The algorithm ensures that at the Nth passthrough, the first N items are in order at the front of the array.
So let’s look at Selection Sort and Bubble Sort together.
Selection Sort has to look at the rest of the items for each item in the list.
Bubble Sort has to look at the entire list for each item in the list.
While this seems straight forward, and no one will fault you for saying Selection Sort is faster (exactly half the runtime) than Bubble Sort, however, this does not matter to Big O.
But here’s the funny thing in the world of Big O Notation: Selection Sortand Bubble Sort are described in exactly the same way.
Tip
Remember to leave your knowledge or preconceived notions of Big O at the door. All you need to do is check this:
Does the algorithm have nested loops?
The exact time does not matter.
Why?
Big O Notation ignores constants.
It is easy to say Selection Sort is a \(O(N^2/2)\) algorithm. It would seem common sense.
However, consider the fact that the length of the array could grow astronomically. If you have an infinitely large array, then it does not matter.
\(O(N^2/2)\) is just \(O(N^2)\).
\(O(N/2+50000)\) is just \(O(N)\).
\(O(100000)\) is just \(O(1)\).
Big O Categories¶
Big O ignores the constants because it uses categories.
Big O Notation only concerns itself with general categories of algorithm speeds.
If we were to compare two buildings, one of which is a single-family home andone of which is a skyscraper, it becomes almost moot to mention how many floors each one has.Because the two buildings are so incredibly different in their sizes and functions.
This is the idea behind Big O. It categorizes algorithms into buckets wherein the bucket itself is a definitive measure of how quickly the algorithm grows with input data.
It won’t matter if your \(O(1)\) algorithm takes 10 minutes to run or if your \(O(N^2)\) takes 1 \({\mu}s\), at least not to Big O, because it doesn’t care how long each step takes, only how many steps there will be in relation to larger data set sizes.
Talking about \(O(2N)\) when compared to \(O(N^2)\) is like talkingabout a two-story house compared to a skyscraper.
Chapter 6 - Optimizing for Optimistic Scenarios¶
[T]he worst-case scenario isn’t the only scenario worth considering.
Insertion Sort is a sorting algorithm that reveals the power of analyzingscenarios beyond the worst-case.
Insertion Sort involves: removals, comparisons, shifts, and insertions.
Big O Notation only takes into account the highest order of N when we have multipleorders added together.
This means that:
\(O(N^2 + 2N)\) = \(O(N^2)\)
\(O(N^4 + N^3 + N^2)\) = \(O(N^4)\)
This is because as \(N\) increases, the higher order values become so much larger than the lower-order ones that it becomes trivial to account for them.
Note
Take a look at this chart to understand the severity with which these categories vary.
In the worst-case scenario, Selection Sort is faster than Insertion Sort. However, it is not the case when we look at the average case.
Best, Average and Worst cases follow a Bell Curve, showing that the worst case is, just like the best case, an outlier.
Best Case |
Average Case |
Worst Case |
|
Selection Sort |
\(O(N^2/2)\) |
\(O(N^2/2)\) |
\(O(N^2/2)\) |
Insertion Sort |
\(O(N)\) |
\(O(N^2/2)\) |
\(O(N^2)\) |
Being able to discern between best-, average- and worst-case scenarios is akey skill in choosing the best algorithm for your needs, as well as takingexisting algorithms and optimizing them further to make them significantlyfaster. [W]hile it’s good to be prepared for the worst case, average casesare what happen most of the time.
Chapter 7 - Big O in Everyday Code¶
Note
I don’t have much to say about this chapter beyond that I really appreciate it being there. It nails the Big O Notation mindset into you, and so far, I’ve enjoyed it.
The problems solutions are fairly easy, and it’s a chapter you should read if you haven’t grokked Big O yet.
Tip
If an algorithm has 2 inputs of length \(N\) and \(M\) and it has a nested for loop that loops through them, then what would have been \(O(NxM)\) is reduced to \(O(N^2)\) because in the worst-case scenario, both these arrays would be the same size.
Chapter 8 - Blazing Fast Lookup with Hash Tables¶
A Hash Table is a data structure in which the indices are hashed and correlated with values in an array-like structure.
The example this book uses is alphabet-based hashing, using the product of the numberic value of each alphabet.
“dog” becomes \(4*15*7 = 420\), and the corresponding value associated with “dog” is stored in the 420 th location of an array that holds the values of the Hash Table.
This lookup mechanism is usually designed with the following assumptions
The hashing of a key always yields the same value. This way, there can be a 1:1 relationship between keys and their values.
The hashing is non-random, and doesn’t depend on execution time or machine. This enables the datastructure to be written to disk, if need be.
The efficiency of a hash table depends on three factors:
How much data we’re storing in the hash table.
How many cells are available in the hash table.
Which hash function we’re using.
A good hash function is one that distributes its data across allavailable cells. The more we can spread our data, the fewer collisionswe will have.
A good hash table strikes a balance of avoiding collisions while not consuming lots of memory.To accomplish this, computer scientists have developed the following rule of thumb:for every 7 data elements stored in the hash table, it should have 10 cells.
Hash tables have a lookup efficiency of \(O(N)\).
Hash tables are used in places where you’d usually use plenty of if-else-if-else
statements. Or, you’d use them when you want to store information about a single
entity together. This is most-visible in the JSON
data format.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | """Algorithm to detect whether 2 words are an Anagram or not"""
def is_anagram(word_1: str, word_2: str) -> bool:
"""Checks whether 2 words is an anagram or not"""
word_1_dict = {}
for letter in word_1:
word_1_dict[letter] = word_1_dict.get(letter, 0) + 1
word_2_dict = {}
for letter in word_2:
word_2_dict[letter] = word_2_dict.get(letter, 0) + 1
return word_1_dict == word_2_dict
def test_anagram():
"""Checks the anagram function"""
assert is_anagram("mood", "doom")
assert is_anagram("rot", "tor")
assert is_anagram("something", "nothing") is False
|
Using a dictionary, Python’s implementation of a Hash Table, here, gives us an algorithm of \(O(N)\) order.
Exercises¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | """Algorithm to determine the intersection of two arrays"""
def intersect_arrays(arr_1: list, arr_2: list) -> list:
"""Returns the intersection of two arrays"""
from collections import OrderedDict
arr_1_hash = OrderedDict()
for item in arr_1:
arr_1_hash[item] = arr_1_hash.get(item, 0) + 1
arr_2_hash = OrderedDict()
for item in arr_2:
arr_2_hash[item] = arr_2_hash.get(item, 0) + 1
intersection = []
for key in arr_1_hash:
count_1 = arr_1_hash[key]
count_2 = arr_2_hash.get(key, 0)
# include items that are in arr_1, and arr_2
occurs_in_both = min(count_1, count_2)
intersection.extend([key]*occurs_in_both)
return intersection
def test_intersect_arrays():
"""Tests that the above algorithm works"""
assert intersect_arrays([1, 2, 3], [2]) == [2]
assert intersect_arrays([1, 1, 2, 3], [1, 2, 3]) == [1, 2, 3]
assert intersect_arrays([2, 1, 3], [1, 2, 2, 3]) == [2, 1, 3]
|
Todo
Do the other exercises.
Todo
Write an implementation of Hash Tables in Rust and Python.
Chapter 9 - Crafting Elegant Code with Stacks and Queues¶
Stacks¶
A stack is almost an array, except:
Data can be inserted only at the end of a stack.
Data can be deleted only from the end of a stack.
Only the last element of a stack can be read.
The end of a stack is referred to as its top, and the beginning of the stack is referred to as its bottom.
Tip
In Python, a simple list can be used as a Stack, since lists
support pop
and push
methods. However, it is better to
write a wrapper around a regular list since this way you can
code checks in place so that users cannot insert an item wherever
they want.
Stacks are useful when dealing with the order of things, especially when we care about what was the last item to be put into an array. Looking one step behind is a constant pattern when dealing with stacks.
Chapter 10 - Recursively Recurse with Recursion¶
Recursion has a lot to do with the stack. When a function calls itself, what is known as the Call Stack is assembled. This way, the output of one call is tied to the output of the other, and the eventual disassembly occurs when, the eventual last item returns something that is not tied to another call.
Consider the factorial function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | """Program to find the factorial of a number"""
def factorial(n: int) -> int:
"""Function to find the factorial of a number"""
if n <= 1:
return 1
else:
return n * factorial(n-1)
def test_factorial():
"""Test for the factorial function"""
import math
for number in range(500):
assert factorial(number) == math.factorial(
number), "Could not calculate the factorial of {}".format(number)
|
Here, the function factorial
will not immediately return anything.
For factorial(5)
, it first calls factorial(5)
, and before
it can return anything, it needs to invoke factorial(4)
.
For factorial(4)
, it first calls factorial(4)
, and before
it can return anything, it needs to invoke factorial(3)
.
For factorial(3)
, it first calls factorial(3)
, and before
it can return anything, it needs to invoke factorial(2)
.
For factorial(2)
, it first calls factorial(2)
, and before
it can return anything, it needs to invoke factorial(1)
.
For factorial(1)
, it will return 1.
Notice that until the program reaches factorial(1)
, it must
hold every instance of the function calls in memory. This creates
an untenable situation: the case where the machine cannot keep track of
this stack.
After each function call resolves, in the LIFO order, the call stack is popped, in that order, and the final result is returned to the user.
If the recursion call stack exceeds the short term meory of the computer, there is a Stack Overflow, and the program shuts down.
Exercises¶
1 2 3 4 5 6 7 8 9 | """For an array containing both numbers and other arrays, write a recursive funciton that prints out all the numbers, and just the numbers"""
def print_nums(arr):
for item in arr:
if isinstance(item, int):
print(int)
else:
print_nums(item)
|
Chapter 11 - Learning to Write in Recursive¶
Note
This chapter exists to drill down how you write in recursive. Do all the exercises.
Recursion Categories¶
While recursive algorithms may seem like they’re of one variety, there are multiple ways they may be used.
Repeatedly Execute¶
One way of looking at recursion is to envision it as a repeated execution. You have a function that just needs to occur multiple times for the problem to be solved.
A countdown is such an algorithm.
1 2 3 4 5 6 | def countdown(number) {
"""Recursive implementation of a countdown function"""
if number < 0:
return
print(number)
countdown(number - 1)
|
In the above code-block, countdown
just calls itself, decrementing the input.
As with all recursion, it has a base case, where number < 0
, and quits
there.
Another example of this is a function to recursively find directories in a given directory.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | """Recursive algorithm to find all directories within a given directory"""
def get_directories(path=None):
"""Returns a list of directories and subdirectories"""
import os
if path is None:
path = os.getcwd()
contents = os.listdir(path)
directories = []
for item in contents:
abs_path = os.path.join(path, item)
if os.path.isdir(abs_path):
directories.append(abs_path)
directories.extend(get_directories(abs_path))
return directories
|
Passing Extra Parameters¶
Tip
Note that you cannot pass around a modifiable slice in Python. Slices are merely copies, and not references to the original object.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | """Recursive function to double every item in an array"""
def double_array(array, index=0):
"""Doubles every item in an array"""
array[index] *= 2
index += 1
if index < len(array):
double_array(array, index)
def test_double_array():
import random
input_array = [random.random() for _ in range(random.randint(0, 1000))]
doubled_array = [x*2 for x in input_array]
double_array(input_array)
assert input_array == doubled_array, "Unable to double array"
|
Calculation of Subproblem¶
In some recursive functions, the task is to calculate something based on a subtask. That is to say, recursion can be used to make a calculation based on a sub-problem of a problem.
The factorial of a number is one such problem.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | """Recursive function to get the nth factorial"""
def factorial(n):
"""Function to get the nth factorial"""
if n <= 1:
return 1
else:
return n*factorial(n-1)
def test_factorial():
import random
import math
n = random.randint(1, 1000)
assert factorial(n) == math.factorial(
n), "unable to calculate the nth factorial"
|
Given the definition of a factorial of an integer as the product of all the integers up to and including the given integer, we can write the above program.
Tip
I know it might seem trivial and obvious, but the factorial and the fibonacci numbers are not the same. Do not look at the word factorial and immediately write code for the fibonacci sequence.
To explain this recursive algorithm, note that at each step, a value is returned, with a recursive call. This builds upon recursion’s use of a call stack, and once the base case is reached, the entire stack disassembles, returning the result.
Top-Down Recursion¶
When solving something top down, you will have to push back the actual solution to much later down the stack. Reconsider the factorial problem. You don’t need to understand how the factorial function works to use it.
Consider the Array Sum problem.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | """Recursive solution to array sum"""
def array_sum(arr):
if len(arr) == 1:
return arr[0]
else:
return arr[0] + array_sum(arr[1:])
def test_array_sum():
import random
input_array = [random.randint(0, 1000)
for _ in range(random.randint(1, 1000))]
assert array_sum(input_array) == sum(input_array)
|
In the above approach, we take the liberty of assuming that the array_sum
function’s
implementation is irrelevant. In fact, we continue to use it until we reach the last item.
It is the base case that finally says “I do not need to use this function again”, and returns
a value without adding to the call stack.
Another example of this is the String Reversal problem, one that you will also see in EASY 0344. Reverse String.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | """Recursive solution to reverse a string"""
def reverse_string(input_string):
"""Reverse string"""
if len(input_string) > 1:
return input_string[-1] + reverse_string(input_string[:-1])
else:
return input_string[0]
def test_reverse_string():
sample_strings = [
"hello world",
"peter piper picked a pack of pickled peppers"
"the cake is a lie",
"once upon a time"
]
for st in sample_strings:
assert reverse_string(st) == "".join(reversed(st))
|
The next example deals with counting the occurrence of x
in a string.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | """Recursive algorithm to count x in a string"""
def count_x(input_string):
"""Counts the occurrence of x in a string"""
if len(input_string) == 1:
return 1 if input_string[0] == "x" else 0
elif len(input_string) > 1:
return (1 if input_string[0] == "x" else 0) + count_x(input_string[1:])
else:
return 0
def test_count_x():
"""tests that the counting function works"""
input_strings = [
"x",
"",
"lmzxcvuljwsdf;laxlvjsdflweo asdflkxc;lvslweropupas xlx;lkvj;ljweorijxlj ;xlxx;lkj;lxx ",
"something that doesn't have it"
]
for string in input_strings:
assert count_x(string) == string.count("x")
|
The Staircase Problem¶
Consider the scenario where you have a staircase. Many people have different ways of ascending or descending a staircase. Some people skip a few steps. The end result is the same, but the route taken can be one of many.
If the allowed number of steps are either 1, 2, or 3, the following solution holds water:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | """Let’s say we have a staircase of N steps, and a person has the ability to climb
one, two, or three steps at a time. How many different possible “paths” can
someone take to reach the top? Write a function that will calculate this for N
steps.
"""
def number_of_paths(n):
"""Solves the staircase problem"""
if n < 0:
return 0
elif n in (0, 1):
return 1
else:
return (
number_of_paths(n-1) + number_of_paths(n-2) + number_of_paths(n-3)
)
|
Note
This solution hard-codes the number of stairs one can climb at once.
This problem is a trivial study of a map traversal problem, and it requires studying if one is to eventually solve the travelling salesman problem.
Anagram Generation¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | """Generate all anagrams of a set of letters"""
def anagrams_of(string: str):
"""Returns a list of anagrams of a string"""
if len(string) == 1:
# if the string is a single character, return a list containing that
# single character.
return [string[0]]
# instantiate an empty placeholder
collection = []
# Get all the anagrams of the rest of the string, besides the first
# character, until the end of the string.
substring_anagrams = anagrams_of(string[1:])
for substring_anagram in substring_anagrams:
# now, while looping through this new list of sub-anagrams,
for index in range(len(substring_anagram)+1):
# loop through each character in this anagram, insert the first
# character into a place dictated by the index of the sub-anagrams
anagram = substring_anagram[:index] + \
string[0] + substring_anagram[index:]
# append to our collection
collection.append(anagram)
# return the result
return collection
def test_anagrams_of():
"tests anagrams_of"
input_string = "abc"
result = [
"abc",
"bac",
"cab",
"cba",
"acb",
"bca"
]
anagrams = anagrams_of(input_string)
for item in result:
assert item in anagrams, "{} not found in anagrams of {}".format(
item, input_string)
|
Note
Anagram Generation is not an efficient algorithm At its worst, it takes \(O(N!)\) time.
Exercises¶
1 2 3 4 5 6 7 8 9 10 11 12 | """Use recursion to write a function that accepts an array of strings and
returns the total number of characters across all the strings.
"""
def charlen(input_array):
"""Takes an array of strings and returns the total number of characters"""
if len(input_array) == 1:
return len(input_array[0])
else:
return len(input_array[0]) + charlen(input_array[1:])
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | """Use recursion to write a function that accepts an array of numbers and returns a new
array containing just the even numbers"""
def get_even_numbers(input_array):
"""Return a list of even numbers in the given input array"""
result = []
if len(input_array) == 1:
if input_array[0] % 2 == 0:
return [input_array[0]]
else:
return []
else:
if input_array[0] % 2 == 0:
result.append(input_array[0])
result.extend(get_even_numbers(input_array[1:]))
return result
def test_get_even_numbers():
"""Test get_even_numbers"""
import random
random_list = [random.randint(0, 1000)
for _ in range(random.randint(100, 1000))]
evens = [x for x in random_list if (x % 2 == 0)]
print(len(random_list))
assert get_even_numbers(random_list) == evens, "Unable to get even numbers"
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | """There is a numerical sequence known as Triangular Numbers. The pattern
begins as `1, 3, 5, 10, 15, 21`, and continues onward with the Nth number in
the pattern, which is N plus the previous number. For example, the 7th number
in the sequence is 28, since it's 7 (which is N) plus 21 (the previous number
in the sequence). Write a function that accepts the number for N and
returns the correct number from the series. That is, if the function was passed
the number 7, the function would return 28."""
import pytest
def triangular_number(n):
"""Return the nth triangular number"""
defined_values = [1, 3, 5, 10, 15, 21]
if 1 <= n <= 6:
return defined_values[n-1]
else:
return n + triangular_number(n-1)
@pytest.mark.parametrize("test_input, expected", [
(1, 1),
(2, 3),
(3, 5),
(4, 10),
(5, 15),
(6, 21),
(7, 28),
(8, 36),
(9, 45)
])
def test_triangular_number(test_input, expected):
"""Test triangular numbers"""
assert triangular_number(test_input) == expected, (
f"Triangular number value for {test_input} should have been {expected}"
)
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | """Use recursion to write a function that accepts a string and returns the
first index that contains the character `x`."""
def find_x(input_string, current_index=0):
if input_string[0] == "x":
return current_index
else:
current_index = current_index + 1
return find_x(input_string[1:], current_index)
def test_find_x():
import random
import string
random_letters = [
random.choice(string.ascii_letters) for _ in range(random.randint(10, 1000))
]
random_letters += ["x"]
random.shuffle(random_letters)
random_string = "".join(random_letters)
assert find_x(random_string) == random_string.find(
'x'), "Unable to get the correct index"
|
Unique Paths Problem¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | """Let's say you have a grid of rows and columns. Write a function that accepts
a number of rows and a number of columns, and calculates the number of possible
"shortest" paths from the upper-leftmost square to the lower-rightmost square.
Given a square position (i,j), your next step is one of: (i+1, j), (i-1, j), (i, j+1) or (i, j-1).
"""
def number_of_shortest_paths(rows, columns):
"""Unique Paths Problem"""
if rows == columns == 2:
return 2
elif (rows == 2 and columns == 3) or (rows == 3 and columns == 2):
return 3
elif (rows == 3 and columns == 3):
return 6
|
Chapter 12 - Dynamic Programming¶
Note
This chapter oddly deals mostly with memoization-based dynamic programming. I need to look into more diverse methods of dynamic programming.