13.1 — Sorting & Misc. topics

Binary search algorithm

We have seen examples which perform linear search using a loop:

def linear_search(sequence, target):
    for index in range(len(sequence)):
        if sequence[index] == target:
            return index

    return -1  # Not found

  • Binary search is a faster algorithm for finding an item in a sequence, provided the sequence is sorted.
  • Binary search is similar to looking up a word in an English dictionary. Suppose we are looking for the word “doctor”:
    • We flip pages in the dictionary to find the “d” section, but we may end up a little further to the right, say at the “f” section.
    • Then we flip pages to the left and may end up at the “da” section.
    • Then we flip pages to the right towards “do”, and so on.
    • At each step, we decrease the number of pages to search.
    • The process works because the dictionary is sorted in alphabetical order.

Visualize the binary search algorithm

Implementation

def binary_search(sequence, target):
    low = 0
    high = len(sequence) - 1

    while low <= high:
        middle = (low + high) // 2  # floor division

        if sequence[middle] < target:
            low = middle + 1
        elif sequence[middle] > target:
            high = middle - 1
        else:
            return middle

    return -1  # Not found
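Python's standard library also provides binary search in the bisect module. The sketch below wraps bisect_left so that it returns the same kind of result as binary_search above. The wrapper name bisect_search is our own choice for illustration and is not part of the library.

```python
from bisect import bisect_left


def bisect_search(sequence, target):
    """Return an index of target in a sorted sequence, or -1 if absent."""
    # bisect_left returns the leftmost position where target could be
    # inserted while keeping the sequence sorted.
    pos = bisect_left(sequence, target)
    if pos < len(sequence) and sequence[pos] == target:
        return pos
    return -1  # Not found


data = [2, 5, 8, 13, 21, 34]
found = bisect_search(data, 13)
missing = bisect_search(data, 7)
print(found, missing)  # 3 -1
```

When the sequence contains duplicates, this version always reports the leftmost match, whereas binary_search may return any matching index.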

  • In general, if the length of the sequence is N:
    • Linear search takes time proportional to N
    • Binary search takes time proportional to log(N)
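To make the difference between N and log(N) concrete, the following sketch counts how many elements each search inspects in the worst case for linear search (target at the very end). The helper names linear_steps and binary_steps are hypothetical and exist only for this experiment.

```python
def linear_steps(sequence, target):
    """Count the elements inspected by a linear search."""
    steps = 0
    for item in sequence:
        steps += 1
        if item == target:
            break
    return steps


def binary_steps(sequence, target):
    """Count the loop iterations of a binary search on a sorted sequence."""
    low, high = 0, len(sequence) - 1
    steps = 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sequence[middle] < target:
            low = middle + 1
        elif sequence[middle] > target:
            high = middle - 1
        else:
            break
    return steps


for n in (1_000, 1_000_000):
    data = range(n)   # already sorted: 0, 1, ..., n - 1
    target = n - 1    # last element: worst case for linear search
    print(n, linear_steps(data, target), binary_steps(data, target))
```

Linear search inspects all N elements here, while binary search needs at most 10 iterations for N = 1,000 and at most 20 for N = 1,000,000.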

Sorting algorithms

  • Sorting algorithms sort a sequence into ascending or descending order.

    [1, 3, 2, 0] → [0, 1, 2, 3]
    ['a', 'c', 'b', 'd'] → ['a', 'b', 'c', 'd']

  • There are many sorting algorithms which have different speed and computer memory requirements.

  • We will cover only two: selection sort and insertion sort.

Selection sort algorithm and implementation

Visualize algorithm: https://visualgo.net/en/sorting

def selection_sort(seq):
    N = len(seq)

    for i in range(N):
        # Assume that element at index i is minimum
        min_index = i

        # Find minimum of unsorted elements on right of i
        for k in range(i + 1, N):
            if seq[k] <= seq[min_index]:
                min_index = k

        # Swap elements at i and min_index
        temp = seq[i]
        seq[i] = seq[min_index]
        seq[min_index] = temp

Run selectionsort.py to see the sequence after each iteration.
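The file selectionsort.py is not reproduced in these notes. As a stand-in, here is a minimal sketch of what such a tracing version might look like; the function name selection_sort_trace and the sample list are assumptions made for illustration.

```python
def selection_sort_trace(seq):
    """Sort seq in place, printing it after each pass of the outer loop."""
    n = len(seq)
    for i in range(n):
        min_index = i
        for k in range(i + 1, n):
            if seq[k] <= seq[min_index]:
                min_index = k
        # Tuple assignment swaps the two elements without a temp variable
        seq[i], seq[min_index] = seq[min_index], seq[i]
        print(f"after pass {i}: {seq}")


numbers = [29, 10, 14, 37, 13]
selection_sort_trace(numbers)
```

After pass i, the first i + 1 positions hold the smallest elements in their final order.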

Insertion sort algorithm and implementation

Visualize algorithm: https://visualgo.net/en/sorting

First, let’s look at a single step: we have a partially sorted list of size k and want to insert one more element to get a bigger partially sorted list of size k+1.

sequence = [9, 22, 51, 63, 10, 79, 60, 75]  # partially sorted list

i = 4
key = sequence[i]  # 10

# Elements on the left of key are sorted.
# We want to insert key on the left to keep the partial list sorted.
# Shift elements one place to the right if they are greater than key.
j = i - 1
while j >= 0 and sequence[j] > key:
    sequence[j + 1] = sequence[j]
    j = j - 1

# Moving key to index j+1
sequence[j + 1] = key
print(sequence)  # [9, 10, 22, 51, 63, 79, 60, 75]

Now, we can look at the final insertion sort implementation:

def insertion_sort(seq):
    N = len(seq)

    for i in range(1, N):
        key = seq[i]

        # Elements on the left of key are already sorted.
        # Shift them one place to the right if they are greater than key.
        j = i - 1
        while j >= 0 and seq[j] > key:
            seq[j + 1] = seq[j]
            j = j - 1

        # After shifting right, index j+1 is now available for key
        seq[j + 1] = key

Run insertionsort.py to see the sequence after each iteration.
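The file insertionsort.py is not included here either. The sketch below shows one way such a trace could be produced; the name insertion_sort_trace is hypothetical, and the input is the list from the single-step example.

```python
def insertion_sort_trace(seq):
    """Sort seq in place, printing it after each element is inserted."""
    n = len(seq)
    for i in range(1, n):
        key = seq[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one place right
        while j >= 0 and seq[j] > key:
            seq[j + 1] = seq[j]
            j -= 1
        seq[j + 1] = key
        print(f"after inserting index {i}: {seq}")


values = [9, 22, 51, 63, 10, 79, 60, 75]
insertion_sort_trace(values)
```

After the step for index i, the first i + 1 elements are sorted among themselves, although they may still move when later, smaller elements are inserted.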

Iterables

https://docs.python.org/3/glossary.html#term-iterable

An object capable of returning its members one at a time.

  • all sequence types (such as list, str, and tuple)
  • some non-sequence types like dict, file objects
  • objects of any class that implements __iter__() or __getitem__() method.

Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), etc.).

  • When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object.
  • This iterator is good for one pass over the set of values.
  • Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream.
  • When no more data are available, the __next__() method raises a StopIteration exception instead.
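The "one pass" point can be checked directly. In this small sketch (the variable names are our own), the same iterator is consumed twice: the second attempt yields nothing, while the underlying list is unaffected.

```python
letters = ["a", "b", "c"]   # a list is an iterable
it = iter(letters)          # iter() returns an iterator over the list

first_pass = list(it)       # consumes every item
second_pass = list(it)      # the iterator is already exhausted

print(first_pass)           # ['a', 'b', 'c']
print(second_pass)          # []

# Asking the list for a new iterator starts over from the beginning.
print(list(iter(letters)))  # ['a', 'b', 'c']
```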

>>> rng = range(1, 6, 2)
>>> it = iter(rng)  # create iterator
>>> next(it)
1
>>> next(it)
3
>>> next(it)
5
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

  • When using iterables, it is usually not necessary to call iter() or deal with iterator objects ourselves.
  • The for statement does that automatically for us, creating a temporary variable to hold the iterator for the duration of the loop.

for x in some_iterable:
    do_something(x)

This for loop behaves roughly like the following while loop:

it = iter(some_iterable)
while True:
    try:
        x = next(it)
    except StopIteration:
        break
    do_something(x)

Our simple range implementation

Signalling the end of the values by returning None:

class MyRange:
    # assume step will be positive
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step
        self.value = start

    def nextValue(self):
        value = self.value
        if value < self.stop:
            self.value += self.step
            return value
        # Falls through and returns None once the range is exhausted

rng = MyRange(1, 8, 2)
while True:
    val = rng.nextValue()
    if val is None:  # "not val" would also stop early on a value of 0
        break

    print(val)

Signalling the end of the values by raising StopIteration:

class MyRange:
    # assume step will be positive
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step
        self.value = start

    def nextValue(self):
        value = self.value
        if value < self.stop:
            self.value += self.step
            return value
        raise StopIteration()

rng = MyRange(1, 8, 2)
while True:
    try:
        val = rng.nextValue()
        print(val)
    except StopIteration:
        break

Implementing __iter__() and __next__() so that the object works directly in a for loop:

class MyRange:
    # assume step will be positive
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step
        self.value = start

    def __iter__(self):
        return self

    def __next__(self):
        value = self.value
        if value < self.stop:
            self.value += self.step
            return value
        raise StopIteration()

rng = MyRange(1, 8, 2)
for val in rng:
    print(val)
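One consequence of returning self from __iter__() is that a MyRange object is its own iterator, so it can be looped over only once. A common alternative, sketched here under the hypothetical name ReusableRange, is to have __iter__() produce a new iterator on every call. Writing __iter__() with yield (see the generators section that follows) keeps this short.

```python
class ReusableRange:
    # assume step will be positive
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step

    def __iter__(self):
        # Each call starts a new, independent pass from self.start
        value = self.start
        while value < self.stop:
            yield value
            value += self.step


rng = ReusableRange(1, 8, 2)
first = list(rng)
second = list(rng)
print(first)   # [1, 3, 5, 7]
print(second)  # [1, 3, 5, 7]
```

This mirrors how the built-in range behaves: the range object is an iterable that can be traversed repeatedly, and each traversal gets its own iterator.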

import random


class RandomIter:
    def __init__(self, num):
        self.num = num
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index < self.num:
            self.index += 1
            return random.random()
        raise StopIteration()


for val in RandomIter(4):
    print(val)

Generators using yield statement

def myrange(start, stop, step):
    value = start

    while True:
        if value >= stop:
            break

        yield value
        value += step


for x in myrange(1, 8, 2):
    print(x)

import random


def random_iter(num):
    for i in range(num):
        yield random.random()


for x in random_iter(5):
    print(x)


print(list(random_iter(4)))
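It can help to see that a generator function's body does not run when the function is called; it runs piece by piece as values are requested. The following sketch uses a hypothetical countdown generator with print calls to make the pauses visible.

```python
def countdown(n):
    """Yield n, n - 1, ..., 1 (hypothetical example generator)."""
    print("generator started")
    while n > 0:
        yield n
        n -= 1
    print("generator finished")


gen = countdown(2)   # nothing is printed yet: the body has not started
a = next(gen)        # prints "generator started", then pauses at yield 2
b = next(gen)        # resumes after the yield, then pauses at yield 1

try:
    next(gen)        # prints "generator finished", then raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

print(a, b, exhausted)  # 2 1 True
```

A for loop over countdown(2) performs the same next() calls and handles the StopIteration for us.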

Functions are first-class citizens

Functions can be used in the same way as any other Python object:

  • Functions can be passed as arguments to other functions.
  • Functions can be created inside other functions.
  • A function can be returned from another function.
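The three points above can be sketched in a few lines. All names here (square, apply_twice, make_adder) are made up for illustration.

```python
def square(x):
    return x * x


def apply_twice(func, value):
    """Takes a function as an argument and calls it twice."""
    return func(func(value))


def make_adder(n):
    """Creates a function inside a function and returns it."""
    def add(x):
        return x + n
    return add


# Functions can be stored in a list like any other object
operations = [square, abs, make_adder(10)]
results = [op(-3) for op in operations]

print(results)                 # [9, 3, 7]
print(apply_twice(square, 3))  # 81
```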

Useful examples of passing functions as arguments

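points = [(1, 1, 3), (4, 10, 9), (7, 4, 11)]


def key_func(p):
    # Compare points by their second coordinate first, then first, then third
    return p[1], p[0], p[2]


print(max(points, key=key_func))  # (4, 10, 9)

print(sorted(points, key=key_func))  # [(1, 1, 3), (7, 4, 11), (4, 10, 9)]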

lambda

An anonymous inline function consisting of a single expression which is evaluated when the function is called.

The syntax to create a lambda function is

f = lambda param1, param2, ...: some_expression

Equivalent to

def f(param1, param2, ...):
    return some_expression

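diagnostic_frequencies = {'pharyngitis': 1,
                          'meningitis': 1,
                          'food_poisoning': 2}

# Find the key with the largest value, using a lambda as the key function
maxKey = max(diagnostic_frequencies,
             key=lambda k: diagnostic_frequencies[k])
print(maxKey)  # food_poisoning

# Same result, passing the dictionary's get method as the key function
maxKey = max(diagnostic_frequencies,
             key=diagnostic_frequencies.get)
print(maxKey)  # food_poisoning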

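words = [("happy", 0.7), ("sad", -0.9),
         ("fun", 0.58), ("enemy", -0.4)]

# Sort by the second item of each tuple (the score)
words_sorted = sorted(words, key=lambda tup: tup[1])
print(words_sorted)  # [('sad', -0.9), ('enemy', -0.4), ('fun', 0.58), ('happy', 0.7)]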

Built-in map function

map(func, iterable) applies a function func to every item of an iterable, yielding the results.

In very simplified terms, map() does something like:

def map(func, iterable):
    for item in iterable:
        yield func(item)

Examples:

x = ["23", "3.14", "1.61"]
y = list(map(float, x))
print(y)  # [23.0, 3.14, 1.61]

inventory = {"sofa": 10, "chair": 5, "lamp": 3}
new_inventory = dict(map(lambda item: (item[0].upper(), item[1] + 10),
                         inventory.items()))
print(new_inventory)  # {'SOFA': 20, 'CHAIR': 15, 'LAMP': 13}

persons = [['music', 'running', 'reading'],
           ['movies', 'boardgames'],
           ['boardgames', 'running', 'hiking']]

# Convert each person's list of hobbies into a set
newlist = list(map(set, persons))
print(newlist)
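The same transformations can often be written as comprehensions instead of map(). As a rough comparison (the variable names are our own), the first two examples above could be expressed as follows.

```python
x = ["23", "3.14", "1.61"]
via_map = list(map(float, x))
via_comprehension = [float(item) for item in x]
print(via_map == via_comprehension)  # True

inventory = {"sofa": 10, "chair": 5, "lamp": 3}
via_map_dict = dict(map(lambda item: (item[0].upper(), item[1] + 10),
                        inventory.items()))
# A dict comprehension can unpack each (key, value) pair by name
via_dict_comp = {name.upper(): count + 10
                 for name, count in inventory.items()}
print(via_map_dict == via_dict_comp)  # True
```

Which form reads better is largely a matter of taste; map() tends to be most compact when an existing function such as float can be passed directly.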

Other built-in functions that work with iterables

filter(func, iterable): yields the items of iterable for which func(item) is true. Equivalent to:

def filter(func, iterable):
    for item in iterable:
        if func(item):
            yield item

nums = [-2, 10, -5, 7]
positive = list(filter(lambda x: x > 0, nums))
print(positive)  # [10, 7]

all(iterable): Return True if all items of the iterable are true (or if the iterable is empty). Equivalent to:

def all(iterable):
    for item in iterable:
        if not item:
            return False
    return True

nums = [2, 4, 6, 8]
print(all([x % 2 == 0 for x in nums]))  # True

nums = [1, 4, 6, 8]
print(all([x % 2 == 0 for x in nums]))  # False

any(iterable): Return True if any item of the iterable is true. If the iterable is empty, return False. Equivalent to:

def any(iterable):
    for item in iterable:
        if item:
            return True
    return False

nums = [2, 3, 5, 7]
print(any([x % 2 == 0 for x in nums]))  # True

nums = [1, 3, 5, 7]
print(any([x % 2 == 0 for x in nums]))  # False
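Because all() and any() return as soon as the answer is known, passing a generator expression instead of a list lets them skip the remaining items. This sketch records which values actually get tested; the helper is_even and the list checked are hypothetical names for this demonstration.

```python
checked = []  # records every value that is_even is asked about


def is_even(x):
    checked.append(x)
    return x % 2 == 0


nums = [1, 4, 6, 8]

all_even = all(is_even(x) for x in nums)
checked_by_all = list(checked)
print(all_even, checked_by_all)  # False [1]

checked.clear()
any_even = any(is_even(x) for x in nums)
checked_by_any = list(checked)
print(any_even, checked_by_any)  # True [1, 4]
```

With a list comprehension in square brackets, every item would be tested first to build the list, and only then would all() or any() scan it.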

Inner/Nested functions

  • Functions can be defined inside other functions.
  • This is useful for creating helper functions that are not needed outside the enclosing function; it encapsulates (hides) them.

def main_function():
    def helper(params):
        print("do something", params)

    helper("hello")
    helper(123)


main_function()
helper()  # NameError: helper is not defined outside main_function

Closure

An enclosed (i.e. inner/nested) function has access to the local variables and parameters of the outer (enclosing) function, even after the outer function has returned.

def create_quadratic(a, b, c):
    def quadratic(x):
        return a * x ** 2 + b * x + c

    return quadratic

# q1 and q2 are functions:
q1 = create_quadratic(1, 3, 10)
q2 = create_quadratic(-5, -1, 0)

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-20, 20, 1000)
plt.plot(x, q1(x), "r")
plt.plot(x, q2(x), "b")
plt.show()
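For a quick check without plotting, the returned functions can be called on plain numbers. The sketch below repeats create_quadratic so that it runs on its own, and also peeks at where CPython keeps the captured values; the variable name captured is our own.

```python
def create_quadratic(a, b, c):
    def quadratic(x):
        return a * x ** 2 + b * x + c

    return quadratic


q1 = create_quadratic(1, 3, 10)
q2 = create_quadratic(-5, -1, 0)

print(q1(2))  # 1*4 + 3*2 + 10 = 20
print(q2(2))  # -5*4 - 1*2 + 0 = -22

# Each returned function remembers its own a, b and c in closure cells
captured = dict(zip(q1.__code__.co_freevars,
                    (cell.cell_contents for cell in q1.__closure__)))
print(captured)  # {'a': 1, 'b': 3, 'c': 10}
```

q1 and q2 share the same code but carry different captured values, which is why they trace different curves in the plot.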

Last lecture next week

  • Examples using SciPy
  • Final exam overview
  • Practice problems from past exams