Complexity

Complexity

  • Computational Complexity
  • Complexity in Software Design

Computational Complexity

Investigation of efficiency: Amount of resources needed for algorithms working on data structures

Dimensions of Complexity

Time Complexity

  • Time to run an algorithm with given data size
  • Often main focus of complexity analysis

Space Complexity

  • Memory required to run an algorithm with given data size

Quiz

How many bytes are in a double temparature array
on a 0.1° x 0.1° lat-lon grid and 100 vertical model levels,
containing hourly data for one day?

Quiz

How many bytes are in a double temparature array
of 3600 x 1800 x 100 x 24 elements?

12441600000 or 12 GB

(± overhead & compression)

Big-\(\mathcal{O}\) Notation

  • growth in relation to input data size \(n\)
  • big-\(\mathcal{O}\) notation describes asymptotic behaviour

Big-\(\mathcal{O}\) Notation

Order Notation
constant \(\mathcal{O}(1)\)
logarithmic \(\mathcal{O}(\log{n})\)
linear \(\mathcal{O}(n)\)
quasilinear \(\mathcal{O}(n \log{n})\)
quadratic \(\mathcal{O}(n^2)\)

Quiz

How large is \(\log{n}\) on a computer?

Quiz

How large is \(\log{n}\) on a computer?

64

Quiz

How large is \(\log{n}\) on a computer?

64 (at least on a 64 bit machine)

Example

sorting

Bubble Sort (1/2)

  1. Array with \(n\) elements
  1. Compare first two elements, swap them if left > right
  1. Continue pairwise comparison until last elements
  1. Keep repeating until everything is sorted

Bubble Sort (2/2)

def swap_if_larger(values):
    for i in range(len(values) - 1):
        if values[i] > values[i+1]:
            values[i], values[i+1] = values[i+1], values[i]
            print(values)
test_values = [6, 0, 3, 5, 1]
swap_if_larger(test_values)
[0, 6, 3, 5, 1]
[0, 3, 6, 5, 1]
[0, 3, 5, 6, 1]
[0, 3, 5, 1, 6]
def bubble_sort(values):
    for _ in values:
        swap_if_larger(values)

test_values = [6, 0, 3, 5, 1]
bubble_sort(test_values)
[0, 6, 3, 5, 1]
[0, 3, 6, 5, 1]
[0, 3, 5, 6, 1]
[0, 3, 5, 1, 6]
[0, 3, 1, 5, 6]
[0, 1, 3, 5, 6]

Quick Sort (1/2)

CC-BY-SA idea instructions

Quick Sort (2/2)

def quicksort(l):
    if len(l) == 0: return []
    pivot = l[0]
    lower = [e for e in l[1:] if e < pivot]
    upper = [e for e in l[1:] if e >= pivot]
    return quicksort(lower) + [pivot] + quicksort(upper)

test_values = [6, 0, 3, 1, 5, 7, 4, 2]
quicksort(test_values)
[0, 1, 2, 3, 4, 5, 6, 7]

Bubble vs Quick

There are more

You may want to study e.g.:

  • Heapsort
  • Mergesort
  • Countingsort
  • Bogosort

For an overview, check Wikipedia

Example

Fibonacci

Fibonacci series

\[ f(0) = 0 \\ f(1) = 1 \\ f(n) = f(n-2) + f(n-1) \]

Fibonacci series example (1/3)

Direct implementation

def fib1(n):
    if n < 2: return n
    return fib1(n - 2) + fib1(n - 1)

[fib1(i) for i in range(10)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
%%time
fib1(30)
CPU times: user 144 ms, sys: 0 ns, total: 144 ms
Wall time: 146 ms
832040

Fibonacci series example (2/3)

Iterative implementation with memory

def fib2(n):
    if n < 2: return n
    a = 0; b = 1
    for i in range(2, n + 1): a, b = b, a + b
    return b

[fib2(i) for i in range(10)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
%%time
fib2(30)
CPU times: user 6 μs, sys: 1 μs, total: 7 μs
Wall time: 10.7 μs
832040

👉 Substantially faster

Performance Overview

Fibonacci series example (3/3)

\[ \begin{pmatrix}f_{n}\\f_{n+1}\end{pmatrix} = \begin{pmatrix}0 & 1\\1 & 1\end{pmatrix} \begin{pmatrix}f_{n-1}\\f_{n}\end{pmatrix} \]

\[ \begin{pmatrix}f_{n}\\f_{n+1}\end{pmatrix} = \begin{pmatrix}0 & 1\\1 & 1\end{pmatrix}^{n} \begin{pmatrix}f_0\\f_1\end{pmatrix} \]

vector

from dataclasses import dataclass

@texV  # nicer display
@dataclass
class V2:
    x: float; y: float
    def __add__(s, o):  # self, other
        return V2(s.x + o.x, s.y + o.y)

V2(1, 2)

\[ \begin{pmatrix} 1\\ 2 \end{pmatrix} \]

vector

V2(1, 2) + V2(20, 40)

\[ \begin{pmatrix} 21\\ 42 \end{pmatrix} \]

matrix

\[ \begin{pmatrix}a & b \\ c & d\end{pmatrix} \begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}a x + b y\\c x + d y\end{pmatrix} \]

@texM  # nicer display
@dataclass
class M2:
    a: float; b: float; c: float; d: float

    def __mul__(s, o):  # self, other
        if isinstance(o, V2):
            return V2(s.a * o.x + s.b * o.y, s.c * o.x + s.d * o.y)
        elif isinstance(o, M2):
            return M2(s.a * o.a + s.b * o.c, s.a * o.b + s.b * o.d,
                      s.c * o.a + s.d * o.c, s.c * o.b + s.d * o.d)

test it:

m = M2(0, 1, 1, 1)
m

\[ \begin{pmatrix} 0 & 1\\ 1 & 1 \end{pmatrix} \]

m * V2(0, 1)

\[ \begin{pmatrix} 1\\ 1 \end{pmatrix} \]

m * m * V2(0, 1)

\[ \begin{pmatrix} 1\\ 2 \end{pmatrix} \]

m * m * m * V2(0, 1)

\[ \begin{pmatrix} 2\\ 3 \end{pmatrix} \]

power function

power(2, 4)
16

… applied on matrix class:

power(m, 3) * V2(0, 1)

\[ \begin{pmatrix} 2\\ 3 \end{pmatrix} \]

Hands-on!

  • Implement that generic power function.
  • Do so with a time complexity better than \(\mathcal{O}(n)\)

fib3

def fib3(n):
    if n < 2: return n
    return (power(m, n) * V2(0, 1)).x

[fib3(i) for i in range(10)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
%%time
fib3(30)
CPU times: user 21 μs, sys: 2 μs, total: 23 μs
Wall time: 26 μs
832040

Performance Overview

👉 Unfortunately slower 😬

Performance Overview

👉 Faster for large numbers

Example

hands-on

Hands-on example

\(f(x) = -0.9 x^3 - 1.7 x + 0.1 \sin\left(12x\right) + 1.4\)

Hands-on Session!

  • Implement a bisection algorithm
  • Use it to find the root of f (i.e. x such that f(x) == 0)
  • How often did you have to evaluate f to obtain a result?
  • How does this depend on desired accuracy?

git bisect

“didn’t this work before?”

git bisect start E A

Takeaway

different algorithms for the same problem

Often there are different algorithms solving the same problem. These can come with very different complexity ratings.

Big-\(\mathcal{O}\)

Helps talking about complexity, informs about large problems.

scalar factors can matter

For small problems, scalar factors may be more important than Big-\(\mathcal{O}\).

(but the problem is small anyways)

math can be better

never use algorithms if math can do

… and prefer better algorithms if math can’t do

human time matters

Complexity Classes

\(\mathcal{O}(1)\)

very nice

  • basic arithmetic
  • array element access
  • linked list insertion or removal

\(\mathcal{O}(\log{n})\)

ok for single element operations

  • tree / dict access, insertion or deletion
  • nearest neighbor search
  • bisect

\(\mathcal{O}(n)\)

nice if you have to touch many things

  • maps, reductions etc… (e.g. sum of an array)
  • linked list element access
  • array search

\(\mathcal{O}(n \log{n})\)

ok if you have to touch many things

  • sorting
  • k-d tree construction

\(\mathcal{O}(n^2)\)

not so nice

  • matrix multiplication
  • many algorithms by accident

what would you expect?

Think about what complexity class is expected.
If your program behaves differently, change it or seek advice from colleagues.

Complexity in Software Design

“Complexity kills. It sucks the life out of developers, it makes products difficult to plan, build, and test”

Ray Ozzie, former Microsoft CTO

Interdependence of Components

Here complexity refers to

  • amount of interactions between objects
  • resulting amount of coordination and communication needed

Complexity is not about complicated designs (although simple designs are usually a good idea)

Typical Ways to Counteract

  • Use abstraction when suitable
  • Encapsulate different concepts in different objects
  • Refactor existing code when necessary

Maintainability

If code works, why put more effort in it to reduce complexity and make it more approachable?

  • Your future self and others can understand the code faster,
  • Changing and adding components should be easy and only require local adjustments,
  • Bugs are easier to find when the complexity is low (and components individually testable),
  • This also makes testing easier and more likely to test all code paths (high code coverage)

Resources

Computational Complexity

  • “Introduction to Algorithms” by T. Cormen, C. Leiserson, R. Rivest and C. Stein, MIT Press

Complexity in Software Design