Chapter 11: Data Structures and Algorithms

This chapter introduces the fundamentals of Data Structures and Algorithms (DSA) using Python. Understanding DSA is essential for building efficient applications, preparing for technical interviews, and writing production-grade backend systems.

We’ll cover:

  • Core built-in Python data structures
  • Algorithmic operations like searching and sorting
  • Time and space complexity basics
  • Simple DSA-focused examples with testable code

11.1 Python’s Built-in Data Structures

| Type | Description | Example |
|------|-------------|---------|
| list | Ordered, mutable collection | [1, 2, 3] |
| tuple | Ordered, immutable collection | (1, 2, 3) |
| set | Unordered collection of unique elements | {1, 2, 3} |
| dict | Key-value store | {"name": "Alice"} |
| str | Sequence of Unicode characters | "Hello" |
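
The table above can be tried directly in the interpreter; a minimal sketch touching each type:

```python
numbers = [3, 1, 2]         # list: ordered, mutable
point = (4, 5)              # tuple: ordered, immutable
unique = {1, 2, 2, 3}       # set: duplicates collapse to {1, 2, 3}
person = {"name": "Alice"}  # dict: key-value store
greeting = "Hello"          # str: sequence of characters

numbers.append(4)           # lists can grow in place
print(sorted(unique))       # [1, 2, 3]
print(person["name"])       # Alice
print(len(greeting))        # 5
```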

Example: Stack using a List

```python
stack = []
stack.append(1)
stack.append(2)
print(stack.pop())  # Output: 2
```
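
A queue is the mirror image of a stack, but a plain list is a poor fit: `list.pop(0)` shifts every remaining element, which is O(n). The idiomatic choice is `collections.deque`; a minimal sketch:

```python
from collections import deque

queue = deque()
queue.append("a")        # enqueue at the right end
queue.append("b")
print(queue.popleft())   # dequeue from the left end in O(1) -> a
```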

11.2 Time and Space Complexity (Big-O Notation)

As you develop algorithms and write more advanced code, it’s essential to evaluate how efficiently your code runs. This is where Big-O Notation comes into play. It helps us describe how an algorithm scales with input size.

What is Big-O Notation?

Big-O Notation is a mathematical notation that describes the upper bound of an algorithm's running time or space usage in the worst-case scenario.

Think of it as answering:

"How does performance change as input size grows?"

Time Complexity

Time complexity describes how long an algorithm takes to run based on the size of the input (n).

| Big-O Notation | Name | Example Algorithm |
|----------------|------|-------------------|
| O(1) | Constant Time | Accessing an element by index |
| O(log n) | Logarithmic Time | Binary search in a sorted list |
| O(n) | Linear Time | Iterating through a list |
| O(n log n) | Log-linear Time | Merge sort, quicksort average case |
| O(n^2) | Quadratic Time | Nested loops (e.g., bubble sort) |
| O(2^n) | Exponential Time | Recursive Fibonacci without memoization |
| O(n!) | Factorial Time | Solving the traveling salesman problem |

Space Complexity

Space complexity refers to how much memory an algorithm uses as the input size grows.

For example:

```python
def sum_list(nums):
    total = 0  # O(1) space
    for num in nums:
        total += num
    return total
```
  • Time complexity: O(n) – we loop over n elements
  • Space complexity: O(1) – only one total variable is used
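
For contrast, a function that builds a new collection proportional to its input uses O(n) space. A sketch with a hypothetical `squares` helper:

```python
def squares(nums):
    result = []                   # grows to n elements -> O(n) space
    for num in nums:
        result.append(num * num)  # still O(n) time: one pass
    return result

print(squares([1, 2, 3]))  # [1, 4, 9]
```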

Additional Examples

Constant Time - O(1)

```python
def get_first_item(lst):
    return lst[0]
```

This takes the same amount of time no matter how big lst is.

Linear Time - O(n)

```python
def print_items(lst):
    for item in lst:
        print(item)
```

As n grows, the number of operations grows linearly.

Quadratic Time - O(n^2)

```python
def print_pairs(lst):
    for i in lst:
        for j in lst:
            print(i, j)
```

For each element, we iterate again over the list.

Logarithmic Time - O(log n)

```python
def binary_search(sorted_list, target):
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_list[mid] == target:
            return mid
        elif sorted_list[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```

Each iteration halves the search space.
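
The standard library ships this halving loop in the `bisect` module; `bisect_left` returns an insertion point rather than -1, so a membership check is needed on top. A sketch:

```python
import bisect

def binary_search_stdlib(sorted_list, target):
    # bisect_left finds the leftmost insertion point in O(log n)
    i = bisect.bisect_left(sorted_list, target)
    if i < len(sorted_list) and sorted_list[i] == target:
        return i
    return -1

print(binary_search_stdlib([1, 3, 5, 7, 9], 7))  # 3
print(binary_search_stdlib([1, 3, 5, 7, 9], 4))  # -1
```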

Comparing Time Complexities with Growth

Imagine you're processing a list of 1,000 elements:

  • O(1): 1 step
  • O(log n): ~10 steps
  • O(n): 1,000 steps
  • O(n log n): ~10,000 steps
  • O(n²): 1,000,000 steps

This is why choosing the right algorithm matters!
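
The step counts above are easy to verify with a few lines of arithmetic (log base 2, n = 1,000; the ~10,000 figure for O(n log n) is rounded):

```python
import math

n = 1_000
print(round(math.log2(n)))      # ~10 steps for O(log n)
print(n)                        # 1,000 steps for O(n)
print(round(n * math.log2(n)))  # ~9,966 steps for O(n log n)
print(n ** 2)                   # 1,000,000 steps for O(n^2)
```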

11.3 Common Algorithms in Python

Now that we've discussed the Big-O Notation, let's review some common algorithms in Python.

Linear Search

```python
def linear_search(arr, target):
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1
```

Time Complexity

Best Case: O(1)

  • If the target is the first element in the list.

Worst Case: O(n)

  • If the target is at the end or not in the list at all.

Average Case: O(n)

  • On average, the target will be found halfway through the list (assuming uniform distribution).

Space Complexity

Average Case: O(1) — constant space.

  • The algorithm only uses a few variables (i, value, and target) regardless of input size.
  • It does not use any additional data structures or recursive calls.
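
Python's built-in `list.index` performs the same linear scan, but raises `ValueError` instead of returning -1; a thin wrapper reproduces the behaviour described above:

```python
def linear_search_builtin(arr, target):
    try:
        return arr.index(target)  # C-level linear scan, still O(n)
    except ValueError:
        return -1

print(linear_search_builtin([4, 2, 7], 7))  # 2
print(linear_search_builtin([4, 2, 7], 9))  # -1
```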

Binary Search

```python
def binary_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
```

Note: Requires the array to be sorted.

Time Complexity

Best Case: O(1)

  • If the target is the middle element, found on the first comparison.

Worst Case: O(log n)

  • The array is halved in each step, so the maximum number of iterations is logarithmic in base 2.

Average Case: O(log n)

  • Even if the element is found randomly or not found at all, it still goes through log(n) comparisons.

Space Complexity

Average Case: O(1)

  • The algorithm only uses a few variables (left, right, and mid) regardless of input size.
  • It does not use any additional data structures or recursive calls.
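
For contrast with the O(1) iterative version, a recursive variant consumes O(log n) space: each halving adds a stack frame. A sketch:

```python
def binary_search_recursive(arr, target, left=0, right=None):
    if right is None:
        right = len(arr) - 1
    if left > right:
        return -1  # base case: search space exhausted
    mid = (left + right) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        # each recursive call adds one stack frame -> O(log n) space
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)

print(binary_search_recursive([1, 3, 5, 7, 9], 5))  # 2
```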

Bubble Sort

```python
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr
```

Time Complexity

Best Case: O(n)

  • When the array is already sorted and the algorithm detects that a pass made no swaps, breaking early. The version above does not include this check, so it still performs O(n²) comparisons even on sorted input.
  • If optimized: one pass, no swaps → detect sorted → exit early.

Average Case: O(n²)

  • Regardless of data distribution, bubble sort generally performs about n(n-1)/2 comparisons, with a comparable number of swaps.

Worst Case: O(n²)

  • The array is sorted in reverse. Every pass makes n-i-1 comparisons.
  • Total comparisons:
```
(n-1) + (n-2) + ... + 1 = n(n-1)/2
```
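
The early-exit optimization mentioned above tracks whether a pass made any swaps; if not, the array is already sorted and the loop can stop, giving the O(n) best case. A sketch:

```python
def bubble_sort_optimized(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:  # no swaps this pass: already sorted, exit early
            break
    return arr

print(bubble_sort_optimized([5, 1, 4, 2]))  # [1, 2, 4, 5]
```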

11.4 Additional Useful Data Structures in Python

In addition to Python's built-in types (list, set, tuple, dict), there are several other useful data structures in the collections and heapq modules that are worth looking into further.

| Structure | Source Module | Description |
|-----------|---------------|-------------|
| deque | collections | Double-ended queue with O(1) append and pop operations on both ends. Great for queues and sliding-window problems. |
| defaultdict | collections | Dictionary with a default value factory. Excellent for frequency counts and grouping items. |
| OrderedDict | collections | Dict that preserves insertion order (mostly obsolete since Python 3.7, where regular dicts do the same). |
| Counter | collections | Dict subclass for counting hashable items. |
| namedtuple | collections | Immutable, self-documenting tuple subclass with named fields. |
| heapq | heapq | Binary heap implementation, useful for priority queues and efficient minimum retrieval. |
| queue.Queue | queue | Thread-safe FIFO queue, useful in concurrency contexts. |
| array.array | array | Compact storage of basic C-style data types (int, float). |
| set and frozenset | built-in | Mutable and immutable sets with fast membership tests. |
| bitarray (third-party) | bitarray | Efficient arrays of booleans or bits (optional; requires an external library). |
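
A few of these structures in action; a minimal sketch using `Counter`, `defaultdict`, and `heapq` from the table above:

```python
import heapq
from collections import Counter, defaultdict

# Counter: frequency counts in one line
print(Counter("banana").most_common(1))  # [('a', 3)]

# defaultdict: grouping without explicit key checks
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
print(dict(groups))  # {'a': ['apple', 'avocado'], 'b': ['banana']}

# heapq: turn a list into a min-heap for O(log n) minimum retrieval
heap = [5, 1, 4]
heapq.heapify(heap)
print(heapq.heappop(heap))  # 1
```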

Problem solving in computer science revolves around data structures and algorithms. Modeling a problem with the right data structures and solving it with the right algorithms has a significant impact on the performance and usability of the resulting software.

Check your understanding

Test your knowledge of Data Structures and Algorithms in Python
