Python Programming

39. Essential Standard Library Modules

Python's standard library is a collection of modules that come built-in with Python—you don't need to install anything extra to use them. These modules provide powerful tools for common programming tasks: generating random numbers, working with dates and times, exchanging data with other programs, and using specialized data structures that go beyond basic lists and dictionaries.

In this chapter, we'll explore five essential standard library modules that you'll use frequently in real-world Python programming.

39.1) Generating Randomness with random

The random module provides functions for generating random numbers and making random selections. This is useful for simulations, games, testing, sampling data, and any situation where you need unpredictable behavior.

39.1.1) Generating Random Integers with randint()

The randint() function generates a random integer between two values, inclusive on both ends:

python
import random
 
# Simulate rolling a six-sided die
die_roll = random.randint(1, 6)
print(f"You rolled: {die_roll}")  # Output: You rolled: 4 (varies each run)
 
# Generate a random age between 18 and 65
age = random.randint(18, 65)
print(f"Random age: {age}")  # Output: Random age: 42 (varies)

Notice that both the start and end values are included in the possible results. randint(1, 6) can return 1, 2, 3, 4, 5, or 6—all six values are possible.
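
To see the inclusive range for yourself, the following minimal sketch rolls a die many times and collects every distinct result. The seeded random.Random(2026) generator and the name seen are assumptions made only to keep the sketch repeatable; they are not part of the examples above.

```python
import random

# A separate, seeded generator so the sketch gives the same result every run
# (the seed value 2026 is arbitrary)
rng = random.Random(2026)

# Roll a six-sided die 10,000 times and keep each distinct value once
seen = {rng.randint(1, 6) for _ in range(10_000)}

print(sorted(seen))          # Output: [1, 2, 3, 4, 5, 6]
print(min(seen), max(seen))  # Output: 1 6
```

Both endpoints show up, which is the difference from range(1, 6), where 6 would be excluded.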

Here's a practical example that simulates multiple dice rolls:

python
import random
 
# Simulate rolling two dice and calculating their sum
die1 = random.randint(1, 6)
die2 = random.randint(1, 6)
total = die1 + die2
 
print(f"Die 1: {die1}")  # Output: Die 1: 3 (varies)
print(f"Die 2: {die2}")  # Output: Die 2: 5 (varies)
print(f"Total: {total}")  # Output: Total: 8 (varies)
 
if total == 7:
    print("Lucky seven!")
elif total == 2 or total == 12:
    print("Snake eyes or boxcars!")

Why both ends are inclusive: This makes randint() intuitive for common use cases. When you want a number from 1 to 6 (like a die), you write randint(1, 6) and both 1 and 6 are possible results.

39.1.2) Generating Random Floating-Point Numbers

For random decimal numbers, use random() (returns a float from 0.0 up to, but not including, 1.0) or uniform() (returns a float between two specified values):

python
import random
 
# Generate a random float between 0.0 and 1.0 (0.0 included, 1.0 excluded)
probability = random.random()
print(f"Random probability: {probability:.4f}")  # Output: Random probability: 0.7284 (varies)
 
# Generate a random temperature between 15.0 and 30.0 degrees
temperature = random.uniform(15.0, 30.0)
print(f"Temperature: {temperature:.2f}°C")  # Output: Temperature: 23.47°C (varies)
 
# Generate a random price between $10.00 and $99.99
price = random.uniform(10.0, 99.99)
print(f"Price: ${price:.2f}")  # Output: Price: $45.67 (varies)

The random() function is useful when you need a probability value or a percentage. The uniform() function is better when you need a random decimal in a specific range.
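
One common way to use random() as a probability is to compare it against a threshold. The sketch below is only an illustration: the helper name biased_flip, the 30% chance, and the seeded generator are all assumptions chosen for the example.

```python
import random

def biased_flip(p_heads, rng=random):
    """Return 'heads' with probability p_heads, otherwise 'tails'."""
    # random() is uniform on [0.0, 1.0), so it falls below p_heads
    # about p_heads of the time
    return "heads" if rng.random() < p_heads else "tails"

# Seeded generator so repeated runs give the same flips (seed is arbitrary)
rng = random.Random(7)
flips = [biased_flip(0.3, rng) for _ in range(10_000)]

share = flips.count("heads") / len(flips)
print(f"Share of heads: {share:.3f}")  # Close to 0.300; exact digits depend on the seed
```

The same pattern works for any event that should happen only some of the time, such as a 10% chance of a bonus item in a game.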

39.1.3) Making Random Choices with choice()

The choice() function randomly selects one element from a sequence (list, tuple, or string):

python
import random
 
# Randomly select a color
colors = ["red", "blue", "green", "yellow", "purple"]
selected_color = random.choice(colors)
print(f"Selected color: {selected_color}")  # Output: Selected color: green (varies)
 
# Randomly select a winner from participants
participants = ["Alice", "Bob", "Charlie", "Diana"]
winner = random.choice(participants)
print(f"The winner is: {winner}")  # Output: The winner is: Bob (varies)
 
# Randomly select a character from a string
vowels = "aeiou"
random_vowel = random.choice(vowels)
print(f"Random vowel: {random_vowel}")  # Output: Random vowel: i (varies)

This is particularly useful for games, random sampling, or selecting random test data. Each element in the sequence has an equal probability of being chosen.

Here's a more complex example that simulates a simple quiz game:

python
import random
 
# Quiz questions with their answers
questions = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
    ("What color is the sky?", "blue")
]
 
# Randomly select a question
question, correct_answer = random.choice(questions)
print(f"Question: {question}")
 
user_answer = input("Your answer: ")
if user_answer.lower() == correct_answer.lower():
    print("Correct!")
else:
    print(f"Wrong! The answer was: {correct_answer}")

39.1.4) Selecting Multiple Random Items with sample()

When you need to select multiple unique items from a sequence, use sample(). This is like drawing cards from a deck without replacement—once an item is selected, it won't be selected again:

python
import random
 
# Select 3 random students for a group project
students = ["Alice", "Bob", "Charlie", "Diana", "Eve", "Frank"]
group = random.sample(students, 3)
print(f"Group members: {group}")  # Output: Group members: ['Diana', 'Alice', 'Frank'] (varies)
 
# Draw 5 lottery numbers from 1 to 50 (no duplicates)
lottery_numbers = random.sample(range(1, 51), 5)
lottery_numbers.sort()  # Sort for display
print(f"Lottery numbers: {lottery_numbers}")  # Output: Lottery numbers: [7, 15, 23, 38, 49] (varies)

The second argument to sample() specifies how many items to select. The number must be less than or equal to the length of the sequence—you can't select more items than are available.
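
If you do ask for more items than the sequence contains, sample() raises a ValueError. Here is a short sketch of that failure and one possible guard; the three-student list and the variable k are made up for the illustration.

```python
import random

students = ["Alice", "Bob", "Charlie"]

# Asking for 5 items from a 3-item list fails
try:
    group = random.sample(students, 5)
except ValueError as error:
    group = None
    print(f"Cannot pick 5 students: {error}")

# One possible guard: never request more items than are available
k = min(5, len(students))
group = random.sample(students, k)
print(sorted(group))  # Output: ['Alice', 'Bob', 'Charlie']
```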

39.1.5) Shuffling Sequences with shuffle()

The shuffle() function randomly reorders the elements of a list in place (modifying the original list):

python
import random
 
# Shuffle a deck of cards
cards = ["A♠", "K♠", "Q♠", "J♠", "10♠", "9♠", "8♠", "7♠"]
print(f"Original: {cards}")
random.shuffle(cards)
print(f"Shuffled: {cards}")  # Output: Shuffled: ['Q♠', '7♠', 'A♠', '10♠', '9♠', 'J♠', 'K♠', '8♠'] (varies)
 
# Shuffle quiz questions for randomized order
questions = ["Question 1", "Question 2", "Question 3", "Question 4"]
random.shuffle(questions)
print(f"Randomized order: {questions}")  # Output: Randomized order: ['Question 3', 'Question 1', 'Question 4', 'Question 2'] (varies)
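
Because shuffle() works in place, it returns None, so writing cards = random.shuffle(cards) would lose the list. If you need to keep the original order, one option is to build a shuffled copy with sample(), as in this small sketch (the variable names are illustrative):

```python
import random

cards = ["A", "K", "Q", "J"]

# shuffle() changes the list itself and returns None
result = random.shuffle(cards)
print(result)  # Output: None

# To leave the original untouched, sample every element into a new list
original = ["A", "K", "Q", "J"]
shuffled_copy = random.sample(original, len(original))

print(original)                                   # Output: ['A', 'K', 'Q', 'J']
print(sorted(shuffled_copy) == sorted(original))  # Output: True
```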

Random Module Functions

  • randint: random integers, inclusive on both ends
  • random / uniform: random floats (random: 0.0 to 1.0, uniform: custom range)
  • choice: pick one item, with equal probability for each
  • sample: pick multiple unique items, no duplicates
  • shuffle: reorder a list in place, modifying the original list

39.2) Working with Dates and Times

The datetime module provides classes for working with dates, times, and time intervals. This is essential for scheduling, logging, calculating durations, and any application that needs to track when things happen.

39.2.1) Getting the Current Date and Time

The datetime class represents a specific point in time with both date and time components:

python
from datetime import datetime
 
# Get the current date and time
now = datetime.now()
print(f"Current datetime: {now}")
# Output: Current datetime: 2026-01-02 14:30:45.123456
 
# Access individual components
print(f"Year: {now.year}")      # Output: Year: 2026
print(f"Month: {now.month}")    # Output: Month: 1
print(f"Day: {now.day}")        # Output: Day: 2
print(f"Hour: {now.hour}")      # Output: Hour: 14
print(f"Minute: {now.minute}")  # Output: Minute: 30
print(f"Second: {now.second}")  # Output: Second: 45

For just the date (without time), use the date class:

python
from datetime import date
 
# Get today's date
today = date.today()
print(f"Today: {today}")  # Output: Today: 2026-01-02
 
print(f"Year: {today.year}")    # Output: Year: 2026
print(f"Month: {today.month}")  # Output: Month: 1
print(f"Day: {today.day}")      # Output: Day: 2

39.2.2) Creating Specific Dates and Times

You can create datetime and date objects for specific points in time:

python
from datetime import datetime, date
 
# Create a specific date
birthday = date(1995, 7, 15)
print(f"Birthday: {birthday}")  # Output: Birthday: 1995-07-15
 
# Create a specific datetime
meeting = datetime(2026, 3, 15, 14, 30)  # March 15, 2026 at 2:30 PM
print(f"Meeting: {meeting}")  # Output: Meeting: 2026-03-15 14:30:00

This is useful for representing deadlines, appointments, historical dates, or any fixed point in time:

python
from datetime import date
 
# Important dates in a project
project_start = date(2026, 1, 15)
project_end = date(2026, 6, 30)
 
print(f"Project duration: {project_start} to {project_end}")
# Output: Project duration: 2026-01-15 to 2026-06-30

39.2.3) Calculating Time Differences with timedelta

The timedelta class represents a duration—the difference between two dates or times. You can use it to calculate how much time has passed or to add/subtract time from dates:

python
from datetime import date, timedelta
 
# Calculate age
birth_date = date(1995, 7, 15)
today = date(2026, 1, 2)
age_delta = today - birth_date
 
print(f"Days since birth: {age_delta.days}")  # Output: Days since birth: 11129
print(f"Years (approximate): {age_delta.days // 365}")  # Output: Years (approximate): 30

When you subtract one date from another, you get a timedelta object. The days attribute tells you the number of days in that duration.
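
Dividing the number of days by 365 ignores leap days, so the result can be off by one near a birthday. If you need an exact age, one possible approach is to compare the calendar fields directly. The helper name age_in_years and the sample dates below are assumptions for this sketch.

```python
from datetime import date

def age_in_years(birth_date, on_date):
    """Return the number of full years between birth_date and on_date."""
    years = on_date.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

print(age_in_years(date(1995, 7, 15), date(2026, 1, 2)))  # Output: 30

# A case where the days // 365 shortcut is off by one
birth = date(2000, 3, 1)
check = date(2040, 2, 25)              # five days before the 40th birthday
print((check - birth).days // 365)     # Output: 40
print(age_in_years(birth, check))      # Output: 39
```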

You can also create timedelta objects directly to represent specific durations:

python
from datetime import date, timedelta
 
# Add days to a date
today = date(2026, 1, 2)
one_week = timedelta(days=7)
next_week = today + one_week
 
print(f"Today: {today}")        # Output: Today: 2026-01-02
print(f"Next week: {next_week}")  # Output: Next week: 2026-01-09
 
# Subtract days from a date
thirty_days_ago = today - timedelta(days=30)
print(f"30 days ago: {thirty_days_ago}")  # Output: 30 days ago: 2025-12-03

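A timedelta can be created from days, seconds, microseconds, milliseconds, minutes, hours, and weeks. Internally it stores only days, seconds, and microseconds, so the other units are converted for you: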

python
from datetime import datetime, timedelta
 
# Calculate a deadline
now = datetime(2026, 1, 2, 14, 30)
deadline = now + timedelta(hours=48, minutes=30)
 
print(f"Current time: {now}")    # Output: Current time: 2026-01-02 14:30:00
print(f"Deadline: {deadline}")   # Output: Deadline: 2026-01-04 15:00:00
 
# Calculate time remaining
time_left = deadline - now
print(f"Hours remaining: {time_left.total_seconds() / 3600}")  # Output: Hours remaining: 48.5

The total_seconds() method converts the entire duration to seconds, which you can then convert to hours, minutes, or any other unit.
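
A common mix-up is to read the seconds attribute when you meant total_seconds(). The seconds attribute holds only the part of the duration that is left over after whole days are removed. This short sketch shows the difference; the variable name duration is illustrative.

```python
from datetime import timedelta

duration = timedelta(hours=48, minutes=30)

# The duration is stored as 2 whole days plus 30 minutes
print(duration)                  # Output: 2 days, 0:30:00
print(duration.days)             # Output: 2
print(duration.seconds)          # Output: 1800 (only the leftover 30 minutes)
print(duration.total_seconds())  # Output: 174600.0 (the whole duration)
```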

Here's a practical example calculating project milestones:

python
from datetime import date, timedelta
 
# Project planning
project_start = date(2026, 1, 15)
sprint_duration = timedelta(weeks=2)
 
sprint_1_end = project_start + sprint_duration
sprint_2_end = sprint_1_end + sprint_duration
sprint_3_end = sprint_2_end + sprint_duration
 
print(f"Sprint 1: {project_start} to {sprint_1_end}")
# Output: Sprint 1: 2026-01-15 to 2026-01-29
print(f"Sprint 2: {sprint_1_end} to {sprint_2_end}")
# Output: Sprint 2: 2026-01-29 to 2026-02-12
print(f"Sprint 3: {sprint_2_end} to {sprint_3_end}")
# Output: Sprint 3: 2026-02-12 to 2026-02-26

39.2.4) Comparing Dates and Times

Date and datetime objects can be compared using standard comparison operators:

python
from datetime import date
 
# Compare dates
date1 = date(2026, 1, 15)
date2 = date(2026, 2, 20)
date3 = date(2026, 1, 15)
 
print(date1 < date2)   # Output: True
print(date1 == date3)  # Output: True
print(date2 > date1)   # Output: True

This is useful for checking deadlines, validating date ranges, and sorting dates:

python
from datetime import date
 
# Check if a date is in the past
event_date = date(2025, 12, 25)
today = date(2026, 1, 2)
 
if event_date < today:
    print("This event has already passed")  # Output: This event has already passed
else:
    print("This event is upcoming")
 
# Sort a list of dates
important_dates = [
    date(2026, 3, 15),
    date(2026, 1, 10),
    date(2026, 2, 28)
]
 
important_dates.sort()
print("Dates in order:")  # Output: Dates in order:
for d in important_dates:
    print(f"  {d}")
# Output:
#   2026-01-10
#   2026-02-28
#   2026-03-15
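
Validating a date range works with the same operators, and Python lets you chain them. The following is a small sketch; the helper name is_within is hypothetical, and the project dates are reused from the earlier planning example.

```python
from datetime import date

def is_within(day, start, end):
    """Return True if day falls between start and end, inclusive."""
    return start <= day <= end

project_start = date(2026, 1, 15)
project_end = date(2026, 6, 30)

print(is_within(date(2026, 3, 15), project_start, project_end))  # Output: True
print(is_within(date(2026, 7, 1), project_start, project_end))   # Output: False
print(is_within(project_end, project_start, project_end))        # Output: True
```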

39.2.5) Formatting Dates and Times with strftime()

The strftime() method (string format time) converts dates and times into formatted strings. You specify the format using special codes:

python
from datetime import datetime
 
now = datetime(2026, 1, 2, 14, 30, 45)
 
# Common date formats
print(now.strftime("%Y-%m-%d"))           # Output: 2026-01-02
print(now.strftime("%m/%d/%Y"))           # Output: 01/02/2026
print(now.strftime("%B %d, %Y"))          # Output: January 02, 2026
print(now.strftime("%A, %B %d, %Y"))      # Output: Friday, January 02, 2026
 
# Common time formats
print(now.strftime("%H:%M:%S"))           # Output: 14:30:45
print(now.strftime("%I:%M %p"))           # Output: 02:30 PM
 
# Combined formats
print(now.strftime("%Y-%m-%d %H:%M:%S"))  # Output: 2026-01-02 14:30:45
print(now.strftime("%B %d, %Y at %I:%M %p"))  # Output: January 02, 2026 at 02:30 PM

Common format codes:

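| Code | Description | Example |
|------|-------------|---------|
| %Y | Year with century | 2026 |
| %m | Month as zero-padded number (01-12) | 01 |
| %d | Day as zero-padded number (01-31) | 02 |
| %B | Full month name | January |
| %b | Short month name | Jan |
| %A | Full weekday name | Friday |
| %a | Short weekday name | Fri |
| %H | Hour 24-hour (00-23) | 14 |
| %I | Hour 12-hour (01-12) | 02 |
| %M | Minute (00-59) | 30 |
| %S | Second (00-59) | 45 |
| %p | AM/PM | PM |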

Here's a practical example creating a log entry:

python
from datetime import datetime
 
def log_event(message):
    """Log an event with a timestamp"""
    now = datetime.now()
    timestamp = now.strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{timestamp}] {message}")
 
log_event("User logged in")
# Output: [2026-01-02 14:30:45] User logged in
 
log_event("File uploaded successfully")
# Output: [2026-01-02 14:30:45] File uploaded successfully

39.2.6) Parsing Dates from Strings with strptime()

The strptime() method (string parse time) converts formatted strings back into datetime objects. You call it on the datetime class itself, as datetime.strptime(), and pass the same format codes to tell Python how to interpret the string:

python
from datetime import datetime
 
# Parse different date formats
date_str1 = "2026-01-15"
date1 = datetime.strptime(date_str1, "%Y-%m-%d")
print(f"Parsed: {date1}")  # Output: Parsed: 2026-01-15 00:00:00
 
date_str2 = "January 15, 2026"
date2 = datetime.strptime(date_str2, "%B %d, %Y")
print(f"Parsed: {date2}")  # Output: Parsed: 2026-01-15 00:00:00
 
# Parse datetime with time
datetime_str = "2026-01-15 14:30:00"
dt = datetime.strptime(datetime_str, "%Y-%m-%d %H:%M:%S")
print(f"Parsed: {dt}")  # Output: Parsed: 2026-01-15 14:30:00

This is essential when reading dates from files, user input, or external data sources:

python
from datetime import datetime
 
# Parse user input
user_input = "03/15/2026"
try:
    event_date = datetime.strptime(user_input, "%m/%d/%Y")
    print(f"Event scheduled for: {event_date.strftime('%B %d, %Y')}")
    # Output: Event scheduled for: March 15, 2026
except ValueError:
    print("Invalid date format. Please use MM/DD/YYYY")

Important: The format string must match the input string exactly, or you'll get a ValueError:

python
from datetime import datetime
 
# This will fail - format doesn't match
try:
    datetime.strptime("2026-01-15", "%m/%d/%Y")  # Wrong format
except ValueError as e:
    print(f"Error: {e}")
    # Output: Error: time data '2026-01-15' does not match format '%m/%d/%Y'
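
When input may arrive in more than one layout, one possible approach is to try a short list of formats in turn. This is only a sketch: the helper parse_date and the KNOWN_FORMATS tuple are hypothetical names, and the list of formats is an assumption you would adapt to your own data. Month names such as "January" are matched according to the current locale, which is English by default.

```python
from datetime import datetime

# Formats this sketch is willing to accept (an assumption for the example)
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")

def parse_date(text):
    """Return a datetime for the first matching format, or None."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # This format did not match; try the next one
    return None

print(parse_date("2026-01-15"))        # Output: 2026-01-15 00:00:00
print(parse_date("03/15/2026"))        # Output: 2026-03-15 00:00:00
print(parse_date("January 15, 2026"))  # Output: 2026-01-15 00:00:00
print(parse_date("15.01.2026"))        # Output: None
```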

datetime Module

  • datetime.now: current date and time
  • date.today: current date
  • datetime / date: create specific dates
  • timedelta: time durations; add to or subtract from dates, calculate differences
  • strftime: format to string, using codes such as %Y, %m, %d, %H, %M, %S
  • strptime: parse from string; the input must match the format exactly

39.3) Reading and Writing JSON Data

JSON (JavaScript Object Notation) is a text format for storing and exchanging structured data. It's one of the most common formats for web APIs, configuration files, and data exchange between programs. Python's json module makes it easy to convert between Python data structures and JSON text.

39.3.1) Understanding JSON Structure

JSON looks similar to Python dictionaries and lists, but with some differences:

JSON supports these data types:

  • Objects (like Python dictionaries): {"name": "Alice", "age": 30}
  • Arrays (like Python lists): [1, 2, 3, 4]
  • Strings: "hello" (must use double quotes)
  • Numbers: 42, 3.14
  • Booleans: true, false (lowercase)
  • Null: null (like Python's None)

Key differences from Python:

  • JSON uses true/false/null instead of Python's True/False/None
  • JSON strings must use double quotes ("text"), not single quotes
  • JSON doesn't support tuples, sets, or custom objects directly

Here's what JSON data looks like:

json
{
    "name": "Alice Johnson",
    "age": 30,
    "email": "alice@example.com",
    "is_active": true,
    "scores": [85, 92, 78, 95],
    "address": {
        "street": "123 Main St",
        "city": "Springfield",
        "zip": "12345"
    }
}

Note: This is pure JSON text, not Python code. Notice the lowercase true and the use of double quotes.

39.3.2) Converting Python Data to JSON with dumps()

The dumps() function (dump string) converts Python data structures to JSON-formatted strings:

python
import json
 
student = {
    "name": "Alice Johnson",
    "age": 30,
    "email": "alice@example.com",
    "is_active": True,
    "scores": [85, 92, 78, 95]
}
 
# Convert a dictionary to JSON
json_string = json.dumps(student)
print(json_string)
# Output: {"name": "Alice Johnson", "age": 30, "email": "alice@example.com", "is_active": true, "scores": [85, 92, 78, 95]}
 
print(type(json_string))  # Output: <class 'str'>

Notice how Python's True became JSON's true in the output. The dumps() function automatically handles these conversions.

For more readable output, use the indent parameter:

python
import json
 
student = {
    "name": "Alice Johnson",
    "age": 30,
    "scores": [85, 92, 78, 95]
}
 
# Pretty-print with indentation
json_string = json.dumps(student, indent=2)
print(json_string)
# Output:
# {
#   "name": "Alice Johnson",
#   "age": 30,
#   "scores": [
#     85,
#     92,
#     78,
#     95
#   ]
# }

The indent parameter specifies how many spaces to use for each indentation level. This makes JSON much easier to read, especially for complex nested structures.

39.3.3) Converting JSON to Python Data with loads()

The loads() function (load string) converts JSON-formatted strings back into Python data structures:

python
import json
 
# JSON string (as you might receive from a web API)
json_string = '{"name": "Bob Smith", "age": 25, "scores": [90, 88, 92]}'
 
# Convert to Python dictionary
student = json.loads(json_string)
print(student)  # Output: {'name': 'Bob Smith', 'age': 25, 'scores': [90, 88, 92]}
print(type(student))  # Output: <class 'dict'>
 
# Access the data like any Python dictionary
print(f"Name: {student['name']}")  # Output: Name: Bob Smith
print(f"Average score: {sum(student['scores']) / len(student['scores'])}")
# Output: Average score: 90.0

JSON's true, false, and null are automatically converted to Python's True, False, and None:

python
import json
 
json_string = '{"active": true, "verified": false, "middle_name": null}'
data = json.loads(json_string)
 
print(data)  # Output: {'active': True, 'verified': False, 'middle_name': None}
print(type(data["active"]))  # Output: <class 'bool'>
print(type(data["middle_name"]))  # Output: <class 'NoneType'>
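
Because JSON has no tuples or sets, a round trip through dumps() and loads() does not always give back exactly what you started with. The sketch below uses made-up data to show three effects: tuples come back as lists, non-string dictionary keys come back as strings, and sets are rejected with a TypeError unless you convert them first.

```python
import json

# A tuple value and an integer key (illustrative data)
record = {"point": (10, 20), 1: "one"}

text = json.dumps(record)
print(text)  # Output: {"point": [10, 20], "1": "one"}

restored = json.loads(text)
print(restored)  # Output: {'point': [10, 20], '1': 'one'}

# Sets are not supported at all
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as error:
    print(f"Cannot serialize: {error}")
    # Output: Cannot serialize: Object of type set is not JSON serializable

# One workaround: convert the set to a list before dumping
print(json.dumps({"tags": sorted({"a", "b"})}))  # Output: {"tags": ["a", "b"]}
```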

39.3.4) Writing JSON to Files with dump()

The dump() function writes Python data directly to a file in JSON format:

python
import json
 
# Student records
students = [
    {"name": "Alice", "age": 20, "gpa": 3.8},
    {"name": "Bob", "age": 22, "gpa": 3.5},
    {"name": "Charlie", "age": 21, "gpa": 3.9}
]
 
# Write to a JSON file
with open("students.json", "w") as file:
    json.dump(students, file, indent=2)
 
print("Data written to students.json")
# Output: Data written to students.json

After running this code, the file students.json contains:

json
[
  {
    "name": "Alice",
    "age": 20,
    "gpa": 3.8
  },
  {
    "name": "Bob",
    "age": 22,
    "gpa": 3.5
  },
  {
    "name": "Charlie",
    "age": 21,
    "gpa": 3.9
  }
]

Why use dump() instead of dumps()? The dump() function writes to the file as it encodes the data, so you don't have to build the complete JSON string yourself and then write it in a second step. Use dump() for files and dumps() when you need the JSON as a string (for example, to send over a network).
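
Both functions produce the same JSON text; they differ only in where it ends up. To check this without touching the disk, this sketch writes to an in-memory io.StringIO buffer, which stands in for a real file here. The buffer and the shortened student list are assumptions for the example.

```python
import io
import json

students = [
    {"name": "Alice", "gpa": 3.8},
    {"name": "Bob", "gpa": 3.5},
]

# dump() writes into any file-like object; StringIO plays the role of a file
buffer = io.StringIO()
json.dump(students, buffer, indent=2)

# dumps() returns the same text as a string
as_string = json.dumps(students, indent=2)

print(buffer.getvalue() == as_string)  # Output: True
```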

39.3.5) Reading JSON from Files with load()

The load() function reads JSON data from a file and converts it to Python data structures:

python
import json
 
# Read from the JSON file we created earlier
with open("students.json", "r") as file:
    students = json.load(file)
 
print(f"Loaded {len(students)} students")  # Output: Loaded 3 students
 
# Work with the data
for student in students:
    print(f"{student['name']}: GPA {student['gpa']}")
# Output:
# Alice: GPA 3.8
# Bob: GPA 3.5
# Charlie: GPA 3.9

39.3.6) Handling JSON Errors

When working with JSON, you might encounter invalid data. Always handle potential errors:

python
import json
 
# Invalid JSON - missing closing brace
invalid_json = '{"name": "Alice", "age": 30'
 
try:
    data = json.loads(invalid_json)
except json.JSONDecodeError as e:
    print(f"Invalid JSON: {e}")
    # Output: Invalid JSON: Expecting ',' delimiter: line 1 column 28 (char 27)

This is especially important when reading JSON from external sources (files, web APIs, user input) where you can't guarantee the data is valid:

python
import json
 
def load_config(filename):
    """Load configuration from a JSON file with error handling"""
    try:
        with open(filename, "r") as file:
            config = json.load(file)
            return config
    except FileNotFoundError:
        print(f"Config file '{filename}' not found")
        return None
    except json.JSONDecodeError as e:
        print(f"Invalid JSON in '{filename}': {e}")
        return None
 
# Try to load configuration
config = load_config("config.json")
if config:
    print(f"Configuration loaded: {config}")
else:
    print("Using default configuration")

39.3.7) Practical JSON Example: Saving and Loading Application State

Here's a complete example showing how to save and load application data:

python
import json
 
def save_game_state(filename, player_data):
    """Save game state to a JSON file"""
    with open(filename, "w") as file:
        json.dump(player_data, file, indent=2)
    print(f"Game saved to {filename}")
 
def load_game_state(filename):
    """Load game state from a JSON file"""
    try:
        with open(filename, "r") as file:
            player_data = json.load(file)
        print(f"Game loaded from {filename}")
        return player_data
    except FileNotFoundError:
        print("No saved game found")
        return None
 
# Game data
player = {
    "name": "Hero",
    "level": 5,
    "health": 85,
    "inventory": ["sword", "shield", "potion"],
    "position": {"x": 10, "y": 20}
}
 
# Save the game
save_game_state("savegame.json", player)
# Output: Game saved to savegame.json
 
# Later, load the game
loaded_player = load_game_state("savegame.json")
# Output: Game loaded from savegame.json
 
if loaded_player:
    print(f"Welcome back, {loaded_player['name']}!")
    print(f"Level: {loaded_player['level']}, Health: {loaded_player['health']}")
    # Output:
    # Welcome back, Hero!
    # Level: 5, Health: 85

json Module

  • dumps: Python → JSON string; use the indent parameter for readability
  • loads: JSON string → Python; handles type conversions
  • dump: Python → JSON file; writes directly to the file, with no separate dumps() + write() step
  • load: JSON file → Python; handle JSONDecodeError

39.4) Practical Containers in collections

The collections module provides specialized container types that extend Python's built-in containers (lists, dictionaries, sets) with additional functionality. These containers solve common problems more elegantly than using basic data structures.

39.4.1) Counting Items with Counter

The Counter class is designed for counting hashable objects. It's a dictionary subclass that stores items as keys and their counts as values.

What Counter accepts as input:

  • Any iterable (list, string, tuple, etc.)
  • Another dictionary with counts
  • Keyword arguments with counts

What Counter stores:

  • A dictionary where keys are the items and values are their counts
  • Example: Counter(['a', 'b', 'a']) stores {'a': 2, 'b': 1}

Key advantages over regular dictionaries:

  • Returns 0 for missing keys instead of raising KeyError
  • Provides counting-specific methods like most_common()
  • Supports arithmetic operations between counters

Basic Usage

python
from collections import Counter
 
# Count letters in a word
word = "mississippi"
letter_counts = Counter(word)
print(letter_counts)
# Output: Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})
 
# Access counts like a dictionary
print(f"Number of 'i's: {letter_counts['i']}")
# Output: Number of 'i's: 4
 
print(f"Number of 'z's: {letter_counts['z']}")
# Output: Number of 'z's: 0 (returns 0 for missing keys, no KeyError!)

Creating Counters from Different Sources

python
from collections import Counter
 
# From a list
votes = ["Alice", "Bob", "Alice", "Charlie", "Alice", "Bob", "Alice"]
vote_counts = Counter(votes)
print(vote_counts)
# Output: Counter({'Alice': 4, 'Bob': 2, 'Charlie': 1})
 
# From a string (counts each character)
letter_counts = Counter("hello")
print(letter_counts)
# Output: Counter({'l': 2, 'h': 1, 'e': 1, 'o': 1})
 
# From a dictionary
existing_counts = {'apple': 3, 'banana': 2}
fruit_counts = Counter(existing_counts)
print(fruit_counts)
# Output: Counter({'apple': 3, 'banana': 2})
 
# From keyword arguments
color_counts = Counter(red=5, blue=3, green=2)
print(color_counts)
# Output: Counter({'red': 5, 'blue': 3, 'green': 2})

Finding Most Common Items with most_common()

Method signature: most_common(n=None)

Parameters:

  • n (optional): Number of most common items to return
  • If n is omitted or None, returns all items

Returns:

  • A list of (item, count) tuples
  • Sorted by count, highest first
  • If counts are equal, items are in the order first encountered

python
from collections import Counter
 
# Analyze word frequency in text
text = "the quick brown fox jumps over the lazy dog the fox"
words = text.split()
word_counts = Counter(words)
 
# Get the 3 most common words
top_3 = word_counts.most_common(3)
print(top_3)
# Output: [('the', 3), ('fox', 2), ('quick', 1)]

Arithmetic Operations on Counters

You can add, subtract, and perform other operations on Counter objects:

python
from collections import Counter
 
# Count items in two groups
group1 = Counter(["apple", "banana", "apple", "orange"])
print(group1)
# Output: Counter({'apple': 2, 'banana': 1, 'orange': 1})
 
group2 = Counter(["banana", "banana", "grape", "apple"])
print(group2)
# Output: Counter({'banana': 2, 'grape': 1, 'apple': 1})
 
# Add counts together
combined = group1 + group2
print(combined)
# Output: Counter({'apple': 3, 'banana': 3, 'orange': 1, 'grape': 1})
 
# Subtract counts (only keeps positive results)
difference = group1 - group2
print(difference)
# Output: Counter({'apple': 1, 'orange': 1})
# banana: 1 - 2 = -1 (negative, so excluded)
# grape: not in group1, so excluded
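
The other operations include intersection (&), which keeps the smaller count of each item, and union (|), which keeps the larger count. Here is a short sketch that reuses the two groups from the example above:

```python
from collections import Counter

group1 = Counter(["apple", "banana", "apple", "orange"])
group2 = Counter(["banana", "banana", "grape", "apple"])

# Intersection: the minimum count of each item found in both counters
common = group1 & group2
print(common)
# Output: Counter({'apple': 1, 'banana': 1})

# Union: the maximum count of each item found in either counter
either = group1 | group2
print(either)
# Output: Counter({'apple': 2, 'banana': 2, 'orange': 1, 'grape': 1})
```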

Practical Example: Analyzing Student Grades

python
from collections import Counter
 
# Grade distribution
grades = ["A", "B", "A", "C", "B", "A", "B", "D", "A", "B", "C", "A"]
grade_counts = Counter(grades)
 
print(f"Total students: {len(grades)}")
# Output: Total students: 12
 
print("\nGrade Distribution:")
for grade, count in grade_counts.most_common():
    percentage = (count / len(grades)) * 100
    bar = "█" * count
    print(f"  {grade}: {count} students ({percentage:4.1f}%) {bar}")
# Output:
# Grade Distribution:
#   A: 5 students (41.7%) █████
#   B: 4 students (33.3%) ████
#   C: 2 students (16.7%) ██
#   D: 1 students ( 8.3%) █

39.4.2) Dictionaries with Default Values Using defaultdict

The defaultdict class is a dictionary subclass that automatically creates entries with a default value when you access a missing key. This eliminates the need for checking if keys exist before using them.

What defaultdict accepts as input:

  • A default factory function (required): A callable that returns the default value for missing keys
  • Any arguments that a regular dict accepts (key-value pairs, another dictionary, keyword arguments)

Key advantages over regular dictionaries:

  • No need to check if a key exists before using it
  • Automatically initializes missing keys with a default value
  • Cleaner, more readable code for grouping, counting, and accumulating operations

Understanding the Default Factory

When you create a defaultdict, you must provide a default factory—a callable (function) that takes no arguments and returns the default value. Common default factories:

  • int - returns 0 (useful for counting)
  • list - returns [] (useful for grouping items)
  • set - returns set() (useful for collecting unique items)
  • str - returns '' (useful for string concatenation)
  • lambda: value - returns a custom default value

python
from collections import defaultdict
 
# Different default factories
counts = defaultdict(int)        # Missing keys return 0
groups = defaultdict(list)       # Missing keys return []
unique = defaultdict(set)        # Missing keys return set()
custom = defaultdict(lambda: "N/A")  # Missing keys return "N/A"
 
# Test with missing keys
print(counts['missing'])     # Output: 0
print(groups['missing'])     # Output: []
print(unique['missing'])     # Output: set()
print(custom['missing'])     # Output: N/A

Basic Usage: Counting with defaultdict

Compare regular dictionary vs defaultdict for counting:

python
from collections import defaultdict
 
word = "mississippi"
 
# Regular dictionary - need to check if key exists
regular_dict = {}
for letter in word:
    if letter not in regular_dict:
        regular_dict[letter] = 0
    regular_dict[letter] += 1
 
print(regular_dict)
# Output: {'m': 1, 'i': 4, 's': 4, 'p': 2}
 
# defaultdict - automatically creates entries with default value
letter_counts = defaultdict(int)  # int() returns 0
for letter in word:
    letter_counts[letter] += 1  # No need to check if key exists!
 
print(dict(letter_counts))
# Output: {'m': 1, 'i': 4, 's': 4, 'p': 2}

How it works:

  1. When you access letter_counts[letter] for a new letter, defaultdict calls int() which returns 0
  2. The key is created with value 0, then += 1 makes it 1
  3. For existing keys, it behaves like a normal dictionary

Grouping Items with defaultdict(list)

A common use case is grouping items into categories:

python
from collections import defaultdict
 
students = [
    ("Alice", "A"),
    ("Bob", "B"),
    ("Charlie", "A"),
    ("Diana", "C"),
    ("Eve", "B"),
    ("Frank", "A")
]
 
# Group students by grade
# With defaultdict - clean and simple
students_by_grade = defaultdict(list)
for name, grade in students:
    students_by_grade[grade].append(name)
 
print(dict(students_by_grade))
# Output: {'A': ['Alice', 'Charlie', 'Frank'], 'B': ['Bob', 'Eve'], 'C': ['Diana']}
 
# Access a grade that doesn't exist yet
print(students_by_grade["D"])  # Output: [] (empty list, not KeyError!)

How it works:

  1. When you access students_by_grade[grade] for a new grade, defaultdict calls list() which returns []
  2. The key is created with an empty list, then .append(name) adds the first student
  3. For existing grades, it just appends to the existing list

Creating defaultdict from Existing Dictionary

You can initialize a defaultdict with existing data:

python
from collections import defaultdict
 
# Start with existing counts
existing_data = {'apple': 5, 'banana': 3}
 
# Create defaultdict from existing dictionary
fruit_counts = defaultdict(int, existing_data)
 
# Add more counts
fruit_counts['apple'] += 2     # 5 + 2 = 7
fruit_counts['orange'] += 1    # 0 + 1 = 1 (new key, starts at 0)
 
print(dict(fruit_counts))
# Output: {'apple': 7, 'banana': 3, 'orange': 1}

Custom Default Factory

You can provide any callable as the default factory:

python
from collections import defaultdict
 
# Use lambda for custom default values
page_views = defaultdict(lambda: {'views': 0, 'unique': 0})
 
page_views['home']['views'] = 100
page_views['home']['unique'] = 75
 
print(page_views['home'])
# Output: {'views': 100, 'unique': 75}
 
print(page_views['about'])  # New key gets default dictionary
# Output: {'views': 0, 'unique': 0}
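
A named function works as a factory too, which can read more clearly than a lambda. In this sketch, new_page_stats is a hypothetical helper name chosen for illustration. The factory runs once per missing key, so every page gets its own independent dictionary:

```python
from collections import defaultdict

def new_page_stats():
    """Hypothetical helper: build a fresh stats dictionary for one page."""
    return {'views': 0, 'unique': 0}

page_views = defaultdict(new_page_stats)

page_views['home']['views'] += 1
page_views['about']['views'] += 5

print(page_views['home'])   # Output: {'views': 1, 'unique': 0}
print(page_views['about'])  # Output: {'views': 5, 'unique': 0}

# Each key holds a separate dictionary object
print(page_views['home'] is page_views['about'])  # Output: False
```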

Important Notes

Accessing vs. Checking for Keys:

python
from collections import defaultdict
 
counts = defaultdict(int)
 
# Accessing a missing key CREATES it
value = counts['missing']  # Creates 'missing' with value 0
print('missing' in counts)  # Output: True
 
# To check without creating, use 'in' or .get()
counts2 = defaultdict(int)
print('missing' in counts2)      # Output: False (doesn't create key)
print(counts2.get('missing'))    # Output: None (doesn't create key)
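
If you want to stop auto-creation once the data has been collected, one option is to set the default_factory attribute to None. After that, a missing key raises KeyError as in a normal dictionary. A minimal sketch with illustrative keys:

```python
from collections import defaultdict

counts = defaultdict(int)
counts['a'] += 1
counts['b'] += 2

# .get() accepts a fallback value and never creates the key
print(counts.get('missing', 0))  # Output: 0
print(len(counts))               # Output: 2

# Setting default_factory to None turns off auto-creation
counts.default_factory = None
try:
    counts['missing']
except KeyError:
    print("KeyError: key was not created")

print('missing' in counts)       # Output: False
```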

39.5) (Optional) Useful Iteration Tools

The itertools module provides functions for creating efficient iterators. These tools help you work with sequences in powerful ways without creating large intermediate lists.

39.5.1) Chaining Iterables with chain()

The chain() function combines multiple iterables into a single iterator that yields elements from each iterable in sequence.

What chain() accepts:

  • Multiple iterables (lists, tuples, strings, etc.) as separate arguments

What chain() returns:

  • An iterator that yields all elements from the first iterable, then all elements from the second, and so on

Key advantage:

  • More memory-efficient than concatenating with + (doesn't create intermediate lists)
  • Works with any iterable, not just lists

python
from itertools import chain
 
# Combine multiple lists
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]
 
combined = chain(list1, list2, list3)
print(list(combined))  # Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]

Because chain() yields elements one at a time from the original iterables, it never builds the combined list that + would create, which matters most for large sequences:

python
from itertools import chain
 
# Process multiple data sources without creating a large combined list
students_class_a = ["Alice", "Bob", "Charlie"]
students_class_b = ["Diana", "Eve", "Frank"]
students_class_c = ["Grace", "Henry", "Iris"]
 
# Iterate over all students without creating a combined list
for student in chain(students_class_a, students_class_b, students_class_c):
    print(f"Processing: {student}")
# Output:
# Processing: Alice
# Processing: Bob
# Processing: Charlie
# Processing: Diana
# Processing: Eve
# Processing: Frank
# Processing: Grace
# Processing: Henry
# Processing: Iris
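
Because chain() returns an iterator rather than a list, it can be consumed only once. The sketch below, with small illustrative lists, shows that a second pass yields nothing, so you need a new chain() call (or a saved list) to iterate again:

```python
from itertools import chain

combined = chain([1, 2], [3, 4])

print(list(combined))  # Output: [1, 2, 3, 4]
leftover = list(combined)
print(leftover)        # Output: [] (already exhausted)

# Build a new chain object whenever a second pass is needed
second_pass = list(chain([1, 2], [3, 4]))
print(second_pass)     # Output: [1, 2, 3, 4]
```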

You can chain different types of iterables:

python
from itertools import chain
 
# Chain lists, tuples, and strings
numbers = [1, 2, 3]
letters = ("a", "b", "c")
word = "xyz"
 
combined = chain(numbers, letters, word)
print(list(combined))  # Output: [1, 2, 3, 'a', 'b', 'c', 'x', 'y', 'z']
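
When the iterables are already collected inside another iterable, such as a list of lists, chain.from_iterable() flattens them without listing each one as a separate argument. A short sketch with illustrative class rosters:

```python
from itertools import chain

classes = [
    ["Alice", "Bob"],
    ["Diana"],
    ["Grace", "Henry"],
]

# Flatten one level of nesting
all_students = list(chain.from_iterable(classes))
print(all_students)
# Output: ['Alice', 'Bob', 'Diana', 'Grace', 'Henry']

# Equivalent call using argument unpacking
print(list(chain(*classes)) == all_students)  # Output: True
```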

39.5.2) Repeating Elements with cycle()

The cycle() function creates an infinite iterator that repeatedly cycles through the elements of an iterable.

What cycle() accepts:

  • A single iterable (list, tuple, string, etc.)

What cycle() returns:

  • An infinite iterator that yields elements from the iterable repeatedly
  • After reaching the end, it starts over from the beginning

Key characteristics:

  • Creates an infinite iterator - never stops on its own
  • Must be used with a stopping condition (counter, break, or zip())
  • Saves a copy of each element the first time through so it can replay them, so memory use grows with the length of the input iterable

python
from itertools import cycle
 
# Create an infinite cycle of colors
colors = cycle(["red", "green", "blue"])
 
# Take the first 10 colors
for i, color in enumerate(colors):
    if i >= 10:
        break
    print(f"Item {i}: {color}")
# Output:
# Item 0: red
# Item 1: green
# Item 2: blue
# Item 3: red
# Item 4: green
# Item 5: blue
# Item 6: red
# Item 7: green
# Item 8: blue
# Item 9: red

Warning: cycle() creates an infinite iterator. Always use it with a stopping condition (like a counter or break statement), or you'll create an infinite loop.
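
Another way to add a stopping condition is itertools.islice(), which takes a fixed number of items from any iterator. A minimal sketch, assuming you want the first seven colors:

```python
from itertools import cycle, islice

colors = cycle(["red", "green", "blue"])

# islice() stops after 7 items, so the infinite cycle is safe to consume
first_seven = list(islice(colors, 7))
print(first_seven)
# Output: ['red', 'green', 'blue', 'red', 'green', 'blue', 'red']
```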

A practical use case is alternating between values:

python
from itertools import cycle
 
# Alternate between two background colors for table rows
row_colors = cycle(["white", "lightgray"])
 
rows = ["Row 1", "Row 2", "Row 3", "Row 4", "Row 5"]
for row, color in zip(rows, row_colors):
    print(f"{row}: background-color: {color}")
# Output:
# Row 1: background-color: white
# Row 2: background-color: lightgray
# Row 3: background-color: white
# Row 4: background-color: lightgray
# Row 5: background-color: white

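Here we use zip() (which we learned about in Chapter 37) to pair each row with a color. The cycle() iterator automatically repeats the colors as needed, and because zip() stops when its shortest input runs out, the loop ends after the last row even though cycle() never ends on its own.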

39.5.3) Combining chain() and cycle()

You can combine itertools functions for more complex patterns:

python
from itertools import chain, cycle
 
# Create a pattern that cycles through multiple sequences
pattern1 = [1, 2, 3]
pattern2 = [10, 20]
 
# Chain the patterns, then cycle the result
combined_pattern = cycle(chain(pattern1, pattern2))
 
# Take the first 12 values
for i, value in enumerate(combined_pattern):
    if i >= 12:
        break
    print(value, end=" ")
print()  # End the line after the loop
# Output: 1 2 3 10 20 1 2 3 10 20 1 2

This creates a repeating pattern: 1, 2, 3, 10, 20, 1, 2, 3, 10, 20, ...
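
A chain object can only be consumed once, yet cycle() keeps repeating it. That works because cycle() saves each element during the first pass and then replays the saved copies. The sketch below uses islice() as the stopping condition and shows that the original chain is exhausted afterwards:

```python
from itertools import chain, cycle, islice

pattern = chain([1, 2, 3], [10, 20])  # single-pass iterator
repeated = cycle(pattern)

first_twelve = list(islice(repeated, 12))
print(first_twelve)
# Output: [1, 2, 3, 10, 20, 1, 2, 3, 10, 20, 1, 2]

# The chain itself is used up; cycle() is replaying its saved elements
remaining = list(pattern)
print(remaining)  # Output: []
```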

Here's a practical example that uses cycle() with zip() to create a rotating schedule:

python
from itertools import cycle
 
# Create a rotating schedule for team members
team_members = ["Alice", "Bob", "Charlie"]
schedule = cycle(team_members)
 
# Assign tasks to team members in rotation
tasks = [
    "Review code",
    "Write tests",
    "Update documentation",
    "Fix bug #123",
    "Implement feature X",
    "Deploy to staging"
]
 
print("Task Assignments:")
for task, assignee in zip(tasks, schedule):
    print(f"  {assignee}: {task}")
# Output:
# Task Assignments:
#   Alice: Review code
#   Bob: Write tests
#   Charlie: Update documentation
#   Alice: Fix bug #123
#   Bob: Implement feature X
#   Charlie: Deploy to staging

itertools Module

  • chain: Combine iterables. More memory-efficient than +; works with different types
  • cycle: Repeat forever. Creates an infinite iterator; always use with a stopping condition; useful for alternating patterns


In this chapter, we explored five essential standard library modules that extend Python's capabilities:

  • random: Generate random numbers, make random selections, and shuffle sequences—essential for simulations, games, and testing
  • datetime: Work with dates, times, and durations—calculate ages, schedule events, and format timestamps
  • json: Exchange data with other programs using the universal JSON format—save application state, work with web APIs, and store configuration
  • collections: Use specialized containers like Counter for counting and defaultdict for auto-creating keys
  • itertools: Create efficient iterators with chain() for combining sequences and cycle() for repeating patterns

These modules are part of Python's standard library—they're always available, well-tested, and solve common programming problems elegantly. As you build more complex programs, you'll find yourself reaching for these tools frequently. They represent Python's philosophy of "batteries included"—providing powerful, ready-to-use solutions for everyday programming tasks.

© 2025. Primesoft Co., Ltd.