34. Comprehensions: A Compact Way to Create Lists, Dictionaries, and Sets
Comprehensions are one of Python's most elegant features, allowing you to create and transform collections in a single, readable line of code. Instead of writing multiple lines with loops and append operations, comprehensions let you express the same logic more concisely and often more clearly.
In this chapter, we'll explore how to use list comprehensions, dictionary comprehensions, and set comprehensions to write more Pythonic code. We'll see how to incorporate conditional logic, when to choose comprehensions over traditional loops, and how to handle more complex scenarios with nested iterations.
34.1) List Comprehensions for Creating and Transforming Lists
34.1.1) The Basic List Comprehension Syntax
A list comprehension provides a compact way to create a new list by applying an expression to each item in an existing sequence. The basic syntax is:
[expression for item in iterable]
This creates a new list where each element is the result of evaluating expression for each item in the iterable (any object you can loop over, such as a list, range, or string).
Let's start with a simple example. Suppose we want to create a list of squares for numbers 0 through 4:
# Traditional approach with a loop
squares = []
for number in range(5):
    squares.append(number ** 2)
print(squares) # Output: [0, 1, 4, 9, 16]
With a list comprehension, we can express this more concisely:
# Using a list comprehension
squares = [number ** 2 for number in range(5)]
print(squares) # Output: [0, 1, 4, 9, 16]
Both approaches produce the same result, but the comprehension is more compact and, once you're familiar with the syntax, often easier to read. The comprehension clearly shows that we're creating a list of squared values.
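Because the iterable can be anything you can loop over, the same pattern applies to strings as well as lists and ranges. A minimal sketch (the variable names and sample values here are illustrative only):

```python
# The iterable in a comprehension can be a list, a range, or a string.
word = "abc"

# Iterating over a string yields one character at a time
letters = [char.upper() for char in word]
print(letters)  # Output: ['A', 'B', 'C']

# Iterating over a list of strings
lengths = [len(name) for name in ["Ann", "Robert"]]
print(lengths)  # Output: [3, 6]
```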
34.1.2) Transforming Existing Data
List comprehensions excel at transforming data from one form to another. Let's look at some practical examples.
Converting temperatures from Celsius to Fahrenheit:
# Temperature data in Celsius
celsius_temps = [0, 10, 20, 30, 40]
# Convert to Fahrenheit using the formula: F = C * 9/5 + 32
fahrenheit_temps = [temp * 9/5 + 32 for temp in celsius_temps]
print(fahrenheit_temps) # Output: [32.0, 50.0, 68.0, 86.0, 104.0]
Converting strings to uppercase:
# Product codes in mixed case
product_codes = ["abc123", "def456", "ghi789"]
# Standardize to uppercase
uppercase_codes = [code.upper() for code in product_codes]
print(uppercase_codes) # Output: ['ABC123', 'DEF456', 'GHI789']
34.1.3) Creating Lists from Range Objects
List comprehensions work naturally with range(), which we learned about in Chapter 12. This is useful for generating sequences with specific patterns:
# Generate even numbers from 0 to 10
evens = [n * 2 for n in range(6)] # n goes from 0 to 5, so n*2 gives 0, 2, 4, 6, 8, 10
print(evens) # Output: [0, 2, 4, 6, 8, 10]
# Generate multiples of 5
multiples_of_five = [n * 5 for n in range(1, 6)]
print(multiples_of_five) # Output: [5, 10, 15, 20, 25]
34.1.4) Comprehensions vs Building Lists with Append
A list comprehension builds the new list in a single expression, whereas the traditional approach spells out each step: create an empty list, loop, and append. Both produce the same result, but comprehensions are generally somewhat faster for creating new lists and are considered more Pythonic.
Here's a side-by-side comparison:
# Traditional loop approach
result = []
for i in range(5):
    result.append(i * 3)
print(result) # Output: [0, 3, 6, 9, 12]
# List comprehension approach
result = [i * 3 for i in range(5)]
print(result) # Output: [0, 3, 6, 9, 12]
Both approaches are valid, but the comprehension is more concise and clearly expresses the intent: "create a list of values where each value is i * 3."
34.2) Conditional Logic Inside List Comprehensions
34.2.1) Filtering with if Conditions
One of the most powerful features of list comprehensions is the ability to filter items based on a condition. You can add an if clause at the end of the comprehension to include only items that meet certain criteria:
[expression for item in iterable if condition]
The if clause acts as a filter: Python evaluates the condition for each item, and only items for which the condition is True will be included in the resulting list. Items that don't meet the condition are skipped entirely.
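One way to read the filtering form is as shorthand for a loop with an if statement inside it. The sketch below uses made-up sample data to show that both spellings build the same list:

```python
# Hypothetical sample data for illustration
words = ["kiwi", "banana", "fig", "cherry"]

# Loop form: test each item, append only when the condition is True
long_words_loop = []
for word in words:
    if len(word) > 4:
        long_words_loop.append(word)

# Comprehension form: the same condition moves to the end
long_words_comp = [word for word in words if len(word) > 4]

print(long_words_loop)  # Output: ['banana', 'cherry']
print(long_words_comp == long_words_loop)  # Output: True
```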
Let's see this in action with a simple example:
# Get only even numbers from 0 to 9
numbers = range(10)
evens = [n for n in numbers if n % 2 == 0]
print(evens) # Output: [0, 2, 4, 6, 8]
Here, n % 2 == 0 checks if a number is even. Only numbers that pass this test are included in the new list.
Filtering student scores:
# Student test scores
scores = [45, 78, 92, 65, 88, 55, 73, 95]
# Get only passing scores (>= 70)
passing_scores = [score for score in scores if score >= 70]
print(passing_scores) # Output: [78, 92, 88, 73, 95]
34.2.2) Transforming Filtered Items
You can combine filtering with transformation by applying an expression to the filtered items:
# Student scores
scores = [45, 78, 92, 65, 88, 55, 73, 95]
# Get passing scores and scale them to 0-10 range
scaled_passing = [score / 10 for score in scores if score >= 70]
print(scaled_passing) # Output: [7.8, 9.2, 8.8, 7.3, 9.5]
# First filters (keeps only >= 70), then transforms (divides by 10)
Converting and filtering strings:
# Product names with mixed quality
products = ["apple", "BANANA", "cherry", "DATE", "elderberry"]
# Get uppercase versions of products with names longer than 5 characters
long_products_upper = [product.upper() for product in products if len(product) > 5]
print(long_products_upper) # Output: ['BANANA', 'CHERRY', 'ELDERBERRY']
34.2.3) Using Conditional Expressions (if-else) in Comprehensions
Sometimes you want to transform items differently based on a condition, rather than filtering them out. For this, you use a conditional expression (which we learned in Chapter 10) in the expression part of the comprehension:
[expression_if_true if condition else expression_if_false for item in iterable]
This is different from filtering. Here, every item is included in the result; the if-else only determines which expression is applied to each item. The conditional expression appears in the expression part, before the for clause.
Note the syntax difference:
- Filtering: [expr for item in seq if condition] - if at the end, no else
- Conditional expression: [expr_if if cond else expr_else for item in seq] - if-else in the expression, before for
# Classify numbers as even or odd
numbers = range(6)
classifications = ["even" if n % 2 == 0 else "odd" for n in numbers]
print(classifications) # Output: ['even', 'odd', 'even', 'odd', 'even', 'odd']
Applying different transformations based on conditions:
# Student scores
scores = [45, 78, 92, 65, 88, 55, 73, 95]
# Add bonus points to failing scores, keep passing scores as-is
adjusted_scores = [score + 10 if score < 70 else score for score in scores]
print(adjusted_scores) # Output: [55, 78, 92, 75, 88, 65, 73, 95]
In both examples, notice:
- Every item from the original list appears in the result
- The if-else determines what value each item becomes
- No items are filtered out
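The two forms are independent, so they can appear in the same comprehension: a trailing if decides which items are kept, and an if-else in the expression decides what each kept item becomes. A small sketch, with illustrative values:

```python
numbers = [-3, -2, 0, 1, 4, 7]

# Trailing "if" drops negative numbers; the if-else labels what remains
labels = ["even" if n % 2 == 0 else "odd" for n in numbers if n >= 0]
print(labels)  # Output: ['even', 'odd', 'even', 'odd']
```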
34.2.4) Understanding the Difference: Filtering vs Conditional Expression
It's crucial to understand the difference between these two patterns:
Filtering (if at the end) - Some items are excluded:
# Only include positive numbers
numbers = [-2, 5, -1, 8, 0, 3]
positives = [n for n in numbers if n > 0]
print(positives) # Output: [5, 8, 3]
print(len(positives)) # Output: 3 (only 3 items)
# Process: Check condition → If True, include item → If False, skip item
Conditional expression (if-else in the expression) - All items are included but transformed differently:
# Convert negative numbers to zero, keep positive numbers
numbers = [-2, 5, -1, 8, 0, 3]
non_negatives = [n if n > 0 else 0 for n in numbers]
print(non_negatives) # Output: [0, 5, 0, 8, 0, 3]
print(len(non_negatives)) # Output: 6 (all 6 items)
# Process: Check condition → If True, use first expr → If False, use second expr → Always include result
34.3) Dictionary Comprehensions
34.3.1) Basic Dictionary Comprehension Syntax
Just as list comprehensions create lists, dictionary comprehensions create dictionaries. The syntax is similar, but you specify both a key and a value:
{key_expression: value_expression for item in iterable}
This creates a new dictionary where each key-value pair is generated from the iterable.
Let's start with a simple example that creates a dictionary mapping numbers to their squares:
# Create a dictionary of numbers and their squares
squares_dict = {n: n ** 2 for n in range(5)}
print(squares_dict) # Output: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Creating a dictionary from two lists:
# Student names and their scores
names = ["Alice", "Bob", "Charlie"]
scores = [85, 92, 78]
# Create a dictionary mapping names to scores
student_scores = {names[i]: scores[i] for i in range(len(names))}
print(student_scores) # Output: {'Alice': 85, 'Bob': 92, 'Charlie': 78}
A more elegant way to combine two sequences is using zip(), which we'll learn about in Chapter 37. For now, the index-based approach works well.
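As a brief preview, and without going into detail before Chapter 37, the same dictionary could be sketched with the built-in zip(), which pairs up items from both lists by position:

```python
names = ["Alice", "Bob", "Charlie"]
scores = [85, 92, 78]

# zip() yields (name, score) pairs, which we unpack in the for clause
student_scores = {name: score for name, score in zip(names, scores)}
print(student_scores)  # Output: {'Alice': 85, 'Bob': 92, 'Charlie': 78}
```

One practical difference: zip() simply stops at the end of the shorter list, whereas scores[i] would raise an IndexError if scores were shorter than names.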
34.3.2) Transforming Existing Dictionaries
Dictionary comprehensions are excellent for transforming existing dictionaries. You can modify keys, values, or both.
When iterating over a dictionary in a comprehension, use .items() to access both keys and values. The .items() method returns key-value pairs that you can unpack in the for clause:
# Original prices in dollars
prices = {"apple": 1.50, "banana": 0.75, "cherry": 2.00}
# Convert to cents (multiply by 100)
prices_in_cents = {fruit: price * 100 for fruit, price in prices.items()}
print(prices_in_cents) # Output: {'apple': 150.0, 'banana': 75.0, 'cherry': 200.0}
Transforming keys:
# Product codes in lowercase
codes = {"abc": 100, "def": 200, "ghi": 300}
# Convert keys to uppercase
uppercase_codes = {code.upper(): quantity for code, quantity in codes.items()}
print(uppercase_codes) # Output: {'ABC': 100, 'DEF': 200, 'GHI': 300}
Transforming both keys and values:
# Student names and scores
scores = {"alice": 85, "bob": 92, "charlie": 78}
# Capitalize names and scale scores to 0-10 range
formatted_scores = {name.capitalize(): score / 10 for name, score in scores.items()}
print(formatted_scores) # Output: {'Alice': 8.5, 'Bob': 9.2, 'Charlie': 7.8}
34.3.3) Filtering Dictionary Items
Like list comprehensions, dictionary comprehensions can include conditions to filter items:
# Student scores
scores = {"Alice": 85, "Bob": 65, "Charlie": 92, "David": 55, "Eve": 78}
# Get only passing scores (>= 70)
passing_scores = {name: score for name, score in scores.items() if score >= 70}
print(passing_scores) # Output: {'Alice': 85, 'Charlie': 92, 'Eve': 78}
Filtering by key characteristics:
# Product inventory
inventory = {"apple": 50, "banana": 30, "apricot": 20, "cherry": 40}
# Get only products starting with 'a'
a_products = {product: quantity for product, quantity in inventory.items()
              if product.startswith('a')}
print(a_products) # Output: {'apple': 50, 'apricot': 20}
34.3.4) Creating Dictionaries from Sequences
Dictionary comprehensions are useful for creating lookup dictionaries from sequences:
# List of words
words = ["python", "java", "ruby", "javascript"]
# Create a dictionary mapping each word to its length
word_lengths = {word: len(word) for word in words}
print(word_lengths) # Output: {'python': 6, 'java': 4, 'ruby': 4, 'javascript': 10}
34.3.5) Using Conditional Expressions in Dictionary Comprehensions
You can use conditional expressions to compute values differently based on conditions:
# Student scores
scores = {"Alice": 85, "Bob": 65, "Charlie": 92, "David": 55}
# Add "Pass" or "Fail" status
scores_with_status = {name: "Pass" if score >= 70 else "Fail"
                      for name, score in scores.items()}
print(scores_with_status) # Output: {'Alice': 'Pass', 'Bob': 'Fail', 'Charlie': 'Pass', 'David': 'Fail'}
Applying different transformations:
# Product prices
prices = {"apple": 1.50, "banana": 0.75, "cherry": 2.50}
# Apply discount to expensive items (> $2.00)
discounted_prices = {product: price * 0.9 if price > 2.00 else price
                     for product, price in prices.items()}
print(discounted_prices) # Output: {'apple': 1.5, 'banana': 0.75, 'cherry': 2.25}
34.4) Set Comprehensions
34.4.1) Basic Set Comprehension Syntax
Set comprehensions create sets using syntax similar to list comprehensions, but with curly braces:
{expression for item in iterable}
The result is a set, which means duplicate values are automatically removed and the order is not guaranteed.
# Create a set of squares
squares_set = {n ** 2 for n in range(6)}
print(squares_set) # Output: {0, 1, 4, 9, 16, 25}
The key difference from list comprehensions is that sets automatically eliminate duplicates:
# List comprehension - keeps duplicates
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
squared_list = [n ** 2 for n in numbers]
print(squared_list) # Output: [1, 4, 4, 9, 9, 9, 16, 16, 16, 16]
# Set comprehension - removes duplicates
squared_set = {n ** 2 for n in numbers}
print(squared_set) # Output: {16, 1, 4, 9} (order may vary)
Note that the set output order may differ from what you see here. Sets are unordered collections, so Python may display the elements in any order.
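When you need a predictable order for display or comparison, one option is to pass the set to the built-in sorted(), which returns a new list. A minimal sketch:

```python
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
squared_set = {n ** 2 for n in numbers}

# sorted() accepts any iterable and returns a list in ascending order
ordered_squares = sorted(squared_set)
print(ordered_squares)  # Output: [1, 4, 9, 16]
```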
34.4.2) Extracting Unique Values
Set comprehensions are perfect when you need to extract unique values from a collection:
# Student responses (with duplicates)
responses = ["yes", "no", "yes", "maybe", "no", "yes", "maybe"]
# Get unique responses
unique_responses = {response for response in responses}
print(unique_responses) # Output: {'maybe', 'yes', 'no'} (order may vary)
Extracting unique characters from strings:
# Text with repeated characters
text = "mississippi"
# Get unique characters
unique_chars = {char for char in text}
print(unique_chars) # Output: {'m', 'i', 's', 'p'} (order may vary)
34.4.3) Transforming and Filtering with Set Comprehensions
Like other comprehensions, set comprehensions can include transformations and conditions:
# Student names
names = ["Alice", "bob", "CHARLIE", "david", "EVE"]
# Get unique first letters in uppercase
first_letters = {name[0].upper() for name in names}
print(first_letters) # Output: {'A', 'B', 'C', 'D', 'E'} (order may vary)
Filtering with conditions:
# Numbers with duplicates
numbers = [1, -2, 3, -4, 5, -2, 3, 6, -4]
# Get unique positive numbers
positive_numbers = {n for n in numbers if n > 0}
print(positive_numbers) # Output: {1, 3, 5, 6}
34.4.4) When Set Comprehensions Are Most Useful
Set comprehensions are particularly valuable when:
- You need unique values: Automatically removes duplicates
- Order doesn't matter: Sets are unordered, so use them when sequence isn't important
- You'll perform set operations: The result can be used with union, intersection, etc. (as we learned in Chapter 17)
# Student enrollments in two courses
course_a = ["Alice", "Bob", "Charlie", "David"]
course_b = ["Charlie", "David", "Eve", "Frank"]
# Get unique students across both courses using set comprehension
all_students = {student for course in [course_a, course_b] for student in course}
print(all_students) # Output: {'Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'} (order may vary)
34.5) Choosing Comprehensions vs Loops
34.5.1) When Comprehensions Are Better
Comprehensions are generally preferred when you're creating a new collection by transforming or filtering an existing one. They're more concise, often more readable, and typically faster than equivalent loops.
Comprehensions excel when:
- Creating a new collection from an existing one:
# Good use of comprehension
prices = [10.99, 25.50, 8.75, 15.00]
discounted = [price * 0.9 for price in prices]
- The transformation is straightforward:
# Clear and concise
names = ["alice", "bob", "charlie"]
uppercase_names = [name.upper() for name in names]
- Filtering based on simple conditions:
# Easy to understand
scores = [85, 92, 78, 65, 88, 55, 73, 95]
passing = [score for score in scores if score >= 70]
34.5.2) When Traditional Loops Are Better
However, there are situations where traditional loops are more appropriate and readable:
Use loops when:
- The logic is complex or involves multiple steps:
# Too complex for a comprehension
results = []
for score in scores:
    if score >= 90:
        grade = "A"
    elif score >= 80:
        grade = "B"
    elif score >= 70:
        grade = "C"
    else:
        grade = "F"
    results.append({"score": score, "grade": grade})
While you could write this as a comprehension, it would be much harder to read.
- You need to perform actions beyond creating a collection:
# Loop is clearer when performing I/O or side effects
for filename in files:
    with open(filename) as f:
        content = f.read()
        print(f"Processing {filename}")
        # ... more processing
- You need to modify an existing collection in place:
# Modifying a list in place - can't use comprehension
numbers = [1, 2, 3, 4, 5]
for i in range(len(numbers)):
    numbers[i] *= 2
print(numbers) # Output: [2, 4, 6, 8, 10]
- You need to use break or continue with complex logic:
# Finding first occurrence with additional processing
found = None
for item in items:
    if item.startswith("target"):
        found = item
        print(f"Found: {found}")
        break
34.5.3) Readability Considerations
The most important factor is readability. If a comprehension becomes too long or complex, break it into a traditional loop:
# Hard to read - too much happening in one line
result = [item.upper().strip() for item in items if len(item) > 5 and item.startswith('a')]
# Better - use a loop when logic is complex
result = []
for item in items:
    if len(item) > 5 and item.startswith('a'):
        cleaned = item.strip().upper()
        result.append(cleaned)
A good rule of thumb: If your comprehension doesn't fit comfortably on one line (or at most two lines with clear formatting), consider using a loop instead.
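For a comprehension that is borderline, splitting it across lines at the for and if keywords is a common middle ground. The sketch below reformats the comprehension from the example above; the contents of items are made up for illustration:

```python
# Hypothetical input data
items = ["apricot", "apple", "avocado ", "banana", "almond"]

# One clause per line: expression, then for, then if
result = [
    item.strip().upper()
    for item in items
    if len(item) > 5 and item.startswith('a')
]
print(result)  # Output: ['APRICOT', 'AVOCADO', 'ALMOND']
```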
34.5.4) Performance Considerations
Comprehensions are generally faster than equivalent loops, mainly because they avoid looking up and calling the list's append method on every iteration. However, this difference is usually negligible for small to medium-sized collections.
# Both produce the same result
# Comprehension is slightly faster
squares_comp = [n ** 2 for n in range(1000)]
# Loop is slightly slower but more flexible
squares_loop = []
for n in range(1000):
    squares_loop.append(n ** 2)
For most practical purposes, choose based on readability rather than performance. Only optimize for speed if profiling shows that a particular operation is a bottleneck.
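If you want to measure the difference on your own machine, the standard library's timeit module can time both versions. This is a rough sketch; the function names are ours, and the numbers will vary with Python version and hardware:

```python
import timeit

def build_with_comprehension():
    return [n ** 2 for n in range(1000)]

def build_with_loop():
    squares = []
    for n in range(1000):
        squares.append(n ** 2)
    return squares

# Both functions return the same list
print(build_with_comprehension() == build_with_loop())  # Output: True

# Time 1,000 calls of each; results are in seconds and machine-dependent
comp_time = timeit.timeit(build_with_comprehension, number=1000)
loop_time = timeit.timeit(build_with_loop, number=1000)
print(f"Comprehension: {comp_time:.4f} s")
print(f"Loop:          {loop_time:.4f} s")
```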
34.5.5) Combining Approaches
Sometimes the best solution combines both approaches:
# Use comprehension for simple transformation
student_data = [
    {"name": "Alice", "score": 85},
    {"name": "Bob", "score": 92},
    {"name": "Charlie", "score": 78}
]
# Extract scores with comprehension
scores = [student["score"] for student in student_data]
# Use loop for complex processing
for student in student_data:
    score = student["score"]
    if score >= 90:
        print(f"{student['name']}: Excellent!")
    elif score >= 80:
        print(f"{student['name']}: Good job!")
    else:
        print(f"{student['name']}: Keep working!")
34.6) Nested Loops and Multiple for Clauses
34.6.1) Understanding Multiple for Clauses
Comprehensions can include multiple for clauses, which is equivalent to nested loops. The syntax is:
[expression for item1 in iterable1 for item2 in iterable2]
This is equivalent to:
result = []
for item1 in iterable1:
    for item2 in iterable2:
        result.append(expression)
The key point is that the for clauses are read left to right, just like nested loops are written top to bottom.
Let's start with a simple example that creates all combinations of two lists:
# Two lists of values
colors = ["red", "blue"]
sizes = ["S", "M", "L"]
# Create all combinations
combinations = [(color, size) for color in colors for size in sizes]
print(combinations)
# Output: [('red', 'S'), ('red', 'M'), ('red', 'L'), ('blue', 'S'), ('blue', 'M'), ('blue', 'L')]
This creates every possible pairing of a color with a size.
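Because the first for clause acts as the outer loop, swapping the two clauses changes the order of the results, though not which pairs are produced. A short sketch using the same lists:

```python
colors = ["red", "blue"]
sizes = ["S", "M", "L"]

# Here size is the outer loop, so all colors are listed for each size
by_size = [(color, size) for size in sizes for color in colors]
print(by_size)
# Output: [('red', 'S'), ('blue', 'S'), ('red', 'M'), ('blue', 'M'), ('red', 'L'), ('blue', 'L')]
```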
34.6.2) Creating Coordinate Pairs
A common use case is generating coordinate pairs:
# Create a 3x3 grid of coordinates
coordinates = [(x, y) for x in range(3) for y in range(3)]
print(coordinates)
# Output: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
Creating a multiplication table:
# Generate multiplication pairs
products = [(x, y, x * y) for x in range(1, 4) for y in range(1, 4)]
for x, y, product in products:
    print(f"{x} × {y} = {product}")
# Output:
# 1 × 1 = 1
# 1 × 2 = 2
# 1 × 3 = 3
# 2 × 1 = 2
# 2 × 2 = 4
# 2 × 3 = 6
# 3 × 1 = 3
# 3 × 2 = 6
# 3 × 3 = 9
34.6.3) Flattening Nested Lists
Multiple for clauses are useful for flattening nested structures:
# Nested list of numbers
nested_numbers = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Flatten into a single list
flat = [num for sublist in nested_numbers for num in sublist]
print(flat) # Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
This is equivalent to:
flat = []
for sublist in nested_numbers:
    for num in sublist:
        flat.append(num)
Flattening a list of words into characters:
# List of words
words = ["cat", "dog", "bird"]
# Get all characters from all words
all_chars = [char for word in words for char in word]
print(all_chars) # Output: ['c', 'a', 't', 'd', 'o', 'g', 'b', 'i', 'r', 'd']
34.6.4) Adding Conditions to Nested Comprehensions
You can add conditions to filter the results:
# Create pairs where the sum is even
pairs = [(x, y) for x in range(5) for y in range(5) if (x + y) % 2 == 0]
print(pairs)
# Output: [(0, 0), (0, 2), (0, 4), (1, 1), (1, 3), (2, 0), (2, 2), (2, 4), (3, 1), (3, 3), (4, 0), (4, 2), (4, 4)]
Finding common elements between lists:
# Two lists of numbers
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]
# Find pairs where values are equal (common elements)
common = [x for x in list1 for y in list2 if x == y]
print(common) # Output: [4, 5]
Note: For finding common elements, using set intersection is more efficient: set(list1) & set(list2), which we learned in Chapter 17.
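A sketch of the set-based alternative follows. Note two differences from the nested comprehension: the intersection returns a set, so the order is not guaranteed, and duplicates collapse to a single entry. If order matters, a comprehension with a membership test against a set is one option (the name lookup is ours):

```python
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]

# Set intersection: returns a set of the shared values
common_set = set(list1) & set(list2)
print(sorted(common_set))  # Output: [4, 5]

# Order-preserving variant: test membership against a set built once
lookup = set(list2)
common_ordered = [x for x in list1 if x in lookup]
print(common_ordered)  # Output: [4, 5]
```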
34.6.5) Nested Dictionary Comprehensions
You can also use multiple for clauses in dictionary comprehensions:
# Create a dictionary of coordinate sums
coord_sums = {(x, y): x + y for x in range(3) for y in range(3)}
print(coord_sums)
# Output: {(0, 0): 0, (0, 1): 1, (0, 2): 2, (1, 0): 1, (1, 1): 2, (1, 2): 3, (2, 0): 2, (2, 1): 3, (2, 2): 4}
34.6.6) When to Avoid Nested Comprehensions
While nested comprehensions are powerful, they can quickly become hard to read. Consider these guidelines:
Acceptable - relatively simple:
# Two levels of nesting, simple expression
matrix = [[i * j for j in range(3)] for i in range(3)]
print(matrix) # Output: [[0, 0, 0], [0, 1, 2], [0, 2, 4]]
Getting complex - consider a loop:
# Three levels of nesting - hard to read
result = [[[i + j + k for k in range(2)] for j in range(2)] for i in range(2)]
# Better as nested loops for clarity
Rule of thumb: If you have more than two for clauses, or if the expression is complex, use traditional nested loops instead:
# Clearer with explicit loops
result = []
for i in range(2):
    middle = []
    for j in range(2):
        inner = []
        for k in range(2):
            inner.append(i + j + k)
        middle.append(inner)
    result.append(middle)
Comprehensions with multiple for clauses are powerful tools, but remember: clarity is more important than brevity. If a nested comprehension becomes difficult to understand, it's better to use explicit loops.
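It also helps to keep apart two patterns that look similar. A comprehension nested inside the expression (as in the matrix example) produces a list of lists, while multiple for clauses in one comprehension produce a single flat list. A small sketch to compare them:

```python
# Nested comprehension: the inner comprehension is the expression,
# so each value of i produces its own row
matrix = [[i * j for j in range(3)] for i in range(3)]
print(matrix)  # Output: [[0, 0, 0], [0, 1, 2], [0, 2, 4]]

# Multiple for clauses: one flat list, with i as the outer loop
flat = [i * j for i in range(3) for j in range(3)]
print(flat)  # Output: [0, 0, 0, 0, 1, 2, 0, 2, 4]
```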