Python Standard Libraries

Chapter Outline

Python's standard library is one of its greatest strengths. It provides a wide array of modules that cover everything from file handling and math to networking and concurrency. These libraries help developers avoid reinventing the wheel and build powerful applications quickly.

In this chapter, you'll learn:

  • What the standard library is
  • Key modules grouped by category
  • How to use them with practical examples
  • A real-world example that combines several libraries
  • Chapter quiz

8.1 What is the Python Standard Library?

The Python Standard Library is a collection of modules and packages bundled with Python that you can import and use without installing anything separately. These are maintained by the core Python team and are highly reliable.

To use a standard library module:

import os
import math

8.2 Categories and Common Modules

Category Common Modules Purpose
File and Directory Access os, os.path, shutil, pathlib, tempfile, glob, fnmatch Work with the file system: path manipulation, file operations, temp files
Data Serialization json, pickle, marshal, csv, xml, plistlib Read/write structured data: JSON, CSV, XML, binary
Data Persistence / DB sqlite3, dbm, shelve Lightweight database engines and key-value stores
Text Processing re, string, textwrap, difflib, unicodedata, html, xml.etree.ElementTree Regular expressions, text formatting, Unicode handling, HTML/XML parsing
Numeric and Math math, decimal, fractions, random, statistics Basic and advanced math, precise calculations, randomness
Date and Time datetime, time, calendar, zoneinfo Time zones, formatting dates, timestamps
Concurrency / Parallelism threading, multiprocessing, asyncio, queue, concurrent.futures Multi-threading, multi-processing, async I/O
Networking and Internet socket, http, urllib, ftplib, imaplib, smtplib, poplib, ssl, email, xmlrpc Build clients/servers, send emails, HTTP requests, secure sockets
Web and Internet Data html.parser, urllib, http.client, xml, cgi, json, email Parsing and consuming internet data
System Utilities sys, platform, getopt, argparse, logging, traceback, atexit Command-line tools, platform info, logging, error handling
Security and Cryptography hashlib, hmac, secrets, ssl, crypt, uuid Secure hashes, token generation, encryption
Development Tools pdb, doctest, unittest, trace, profile, timeit, cProfile Debugging, profiling, unit testing
Import/Runtime importlib, types, inspect, sys, builtins Dynamically import modules, inspect code during runtime
Operating System Services os, signal, resource, subprocess, shutil Interface with OS, signals, process spawning
Collections / Containers collections, heapq, array, bisect, queue, weakref, types Extended containers (e.g., deque, defaultdict), heaps, memory-safe refs
Compression / Archiving zipfile, tarfile, gzip, bz2, lzma, shutil Read/write compressed and archived files
Testing and QA unittest, doctest, test, pdb, trace, coverage (3rd-party) Writing and running tests
Code Compilation/Execution compile, eval, exec, code, ast, tokenize, dis Dynamic code execution, parsing, AST analysis
GUI Programming tkinter Build desktop apps using the Tk GUI toolkit
Internationalization gettext, locale, calendar Localization, translations, culture-specific formatting
Miscellaneous uuid, base64, time, warnings, copy, contextlib, dataclasses, enum, functools, itertools Utilities for general programming and Pythonic idioms

8.3 Example usage

Let's now review the usage of some of these standard libraries. In later chapters we'll review some of these libraries in further details to understand additional concepts and features.

File and Directory Management

Module Description
os File paths, directory management, environment variables
shutil File operations (copy, move, delete)
pathlib Object-oriented path handling
tempfile Temporary file creation
import os
from pathlib import Path

# Current directory
print(os.getcwd())

# Create a new directory
Path("new_folder").mkdir(exist_ok=True)

# List files
print(os.listdir("."))

Data Serialization

Module Format Use
json JSON Web data, APIs
csv CSV Tabular data
pickle Binary Object serialization
yaml YAML Requires PyYAML (3rd party)
import json

data = {"name": "Alice", "age": 30}
with open("user.json", "w") as f:
    json.dump(data, f)

Data Structures and Algorithms

Module Purpose
collections Extra data types like deque, Counter
heapq Heap queue (priority queue)
bisect Binary search
from collections import Counter

names = ["Alice", "Bob", "Alice"]
print(Counter(names))  # {'Alice': 2, 'Bob': 1}

Date and Time

Module Description
datetime Date and time manipulation
time Timestamps, delays
calendar Calendar-based calculations
from datetime import datetime

now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M"))

Math and Statistics

Module Functionality
math Advanced math functions
statistics Mean, median, mode
random Random number generation
import random

print(random.randint(1, 10))

Networking and APIs

Module Purpose
http.client Low-level HTTP client
urllib URL handling and requests
socket TCP/IP networking
from urllib.request import urlopen

with urlopen("https://www.example.com") as res:
    print(res.read().decode())

Testing and Debugging

Module Purpose
unittest Unit testing framework
doctest Test examples in docstrings
pdb Python debugger
logging App event tracking
import logging

logging.basicConfig(level=logging.INFO)
logging.info("App started")

Concurrency and Parallelism

Module Purpose
threading Lightweight parallel tasks
multiprocessing CPU-bound parallelism
asyncio Async IO and cooperative multitasking
import asyncio

async def say_hello():
    await asyncio.sleep(1)
    print("Hello async!")

asyncio.run(say_hello())

8.4 Real-World Example: Log Archiver Utility

This utility reads logs from a directory, compresses them, and archives them into a timestamped folder.

Directory Structure Before & After

logs/
  ├── app.log
  ├── errors.log
  └── debug.log

archive/
  └── logs_2023-08-05_14-30-00/
      ├── app.log.gz
      ├── errors.log.gz
      └── debug.log.gz

The Script

import os
import gzip
import shutil
from datetime import datetime
from pathlib import Path


def compress_file(source_path: Path, dest_path: Path):
    """Compress a log file using gzip."""
    with open(source_path, 'rb') as f_in:
        with gzip.open(dest_path, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)


def archive_logs(log_dir: str = "logs", archive_root: str = "archive"):
    log_dir_path = Path(log_dir)
    archive_root_path = Path(archive_root)

    if not log_dir_path.exists() or not log_dir_path.is_dir():
        print(f"Log directory '{log_dir}' not found.")
        return

    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    archive_dir = archive_root_path / f"logs_{timestamp}"
    archive_dir.mkdir(parents=True, exist_ok=True)

    log_files = list(log_dir_path.glob("*.log"))
    if not log_files:
        print("No .log files found.")
        return

    for log_file in log_files:
        compressed_file = archive_dir / f"{log_file.stem}.log.gz"
        compress_file(log_file, compressed_file)
        print(f"Archived: {log_file}{compressed_file}")

    print(f"All logs archived to: {archive_dir}")


if __name__ == "__main__":
    archive_logs()

8.5 Summary

You’ve learned how Python’s standard libraries provide built-in solutions for:

  • File handling
  • Dates and times
  • Data serialization
  • HTTP requests
  • Concurrency
  • Logging and testing

You should delve deeper into the various libraries and study the various functions that are provided. Mastering the standard Python libraries would allow you to build applications paster with fewer dependencies.

What is next?

In the next chapter we'll be studying Concurrency and Process Management in Python, using namely the multiprocessing and subprocess modules.

Check your understanding

Test your knowledge of Advanced Object-Oriented Programming in Python