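15. Tuples and Ranges: Simple Immutable Sequences

In Chapter 14 we explored lists, Python's versatile, mutable sequence type. Now we turn to two other important sequence types: tuples and ranges. Lists excel at storing collections that change over time; tuples provide immutable sequences that protect data from modification, and ranges offer a memory-efficient way to represent sequences of numbers.

Understanding when to use each sequence type makes your programs more efficient, safer, and clearer in intent. By the end of this chapter you will know how to use tuples and ranges effectively, and you will understand the common operations that apply to all of Python's sequence types.

15.1) Creating and Using Tuples (Why the Comma Matters)

A tuple is an ordered, immutable sequence of items. Like a list, a tuple can hold data of any type and preserves element order. Unlike a list, once a tuple is created you cannot modify its contents.

Creating Tuples with Parentheses

The most common way to create a tuple is to wrap comma-separated values in parentheses: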

python

# Tuple of student test scores
scores = (85, 92, 78, 95)
print(scores)  # Output: (85, 92, 78, 95)
print(type(scores))  # Output: <class 'tuple'>
 
# Tuple with mixed data types
student_info = ("Alice", 20, "Computer Science", 3.8)
print(student_info)  # Output: ('Alice', 20, 'Computer Science', 3.8)
 
# Empty tuple
empty = ()
print(empty)  # Output: ()
print(len(empty))  # Output: 0

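Tuples use parentheses () as their literal syntax, while lists use square brackets []. This visual distinction helps you see at a glance which type you are working with.

The Comma Creates the Tuple, Not the Parentheses

Here is a key detail that surprises many beginners: it is the comma that creates a tuple, not the parentheses. (The one exception is the empty tuple, which is written ().) The parentheses are often optional and serve mainly to make the tuple stand out or to group it within an expression.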

python

# These both create the same tuple
coordinates_1 = (10, 20)
coordinates_2 = 10, 20  # No parentheses needed!
print(coordinates_1)  # Output: (10, 20)
print(coordinates_2)  # Output: (10, 20)
print(coordinates_1 == coordinates_2)  # Output: True
 
# What matters is the comma
x = (42)  # Just the integer 42 inside parentheses
y = (42,)  # A tuple containing a single element
print(type(x))  # Output: <class 'int'>
print(type(y))  # Output: <class 'tuple'>
print(y)  # Output: (42,)

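The parentheses in (42) are just grouping parentheses, like those in a mathematical expression. To create a single-element tuple you must add a trailing comma: (42,). The comma tells Python that you want a tuple rather than a grouped expression.

When Parentheses Are Required

Although the comma creates the tuple, parentheses are optional in some contexts and required in others to avoid ambiguity: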

python

# In a return statement, parentheses are optional
def get_dimensions():
    return 1920, 1080  # Returns a tuple
 
width, height = get_dimensions()
print(f"Screen: {width}x{height}")  # Output: Screen: 1920x1080
 
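# Parentheses are required when passing a tuple as a function argument
print((1, 2, 3))  # Output: (1, 2, 3)
# Without them, Python would see three separate arguments

# Parentheses keep complex expressions unambiguous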
result = (10, 20) + (30, 40)  # Tuple concatenation
print(result)  # Output: (10, 20, 30, 40)

Creating Single-Element Tuples

A single-element tuple must have a trailing comma, which often catches beginners off guard:

python

# Common mistake: forgetting the comma
not_a_tuple = ("Python")
print(type(not_a_tuple))  # Output: <class 'str'>
print(not_a_tuple)  # Output: Python
 
# Correct: include the trailing comma
is_a_tuple = ("Python",)
print(type(is_a_tuple))  # Output: <class 'tuple'>
print(is_a_tuple)  # Output: ('Python',)
 
# The comma works even without parentheses
also_a_tuple = "Python",
print(type(also_a_tuple))  # Output: <class 'tuple'>
print(also_a_tuple)  # Output: ('Python',)

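Why does Python require this slightly awkward syntax? Because parentheses have another meaning in Python: grouping expressions. Without the comma, Python could not tell whether (42) is a grouped number or a tuple.

Accessing Tuple Elements

Tuples support the same indexing and slicing operations as lists: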

python

# Student information tuple
student = ("Bob", 22, "Physics", 3.6)
 
# Access individual elements (indexing starts at 0)
name = student[0]
age = student[1]
major = student[2]
gpa = student[3]
 
print(f"{name} is {age} years old")  # Output: Bob is 22 years old
print(f"Major: {major}, GPA: {gpa}")  # Output: Major: Physics, GPA: 3.6
 
# Negative indexes work too
last_item = student[-1]
print(f"Last item: {last_item}")  # Output: Last item: 3.6
 
# Slicing produces a new tuple
first_two = student[:2]
print(first_two)  # Output: ('Bob', 22)
print(type(first_two))  # Output: <class 'tuple'>

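Every technique you learned in Chapter 14 for reading list elements by index or slice works the same way on tuples. The key difference is that a tuple cannot be modified once created, so assigning to an index or a slice is not allowed.

15.2) Tuple Packing and Unpacking

One of the most powerful and elegant features of tuples is the ability to pack several values together and unpack them into separate variables. This makes Python code concise and readable.

Tuple Packing

Tuple packing happens when you write several values separated by commas: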

python

# Pack values into a tuple
coordinates = 10, 20, 30
print(coordinates)  # Output: (10, 20, 30)
 
# Pack different types
user_data = "Alice", 25, "alice@example.com"
print(user_data)  # Output: ('Alice', 25, 'alice@example.com')
 
# Pack function return values
def get_statistics(numbers):
    total = sum(numbers)
    count = len(numbers)
    average = total / count
    return total, count, average  # Packs three values into a tuple
 
stats = get_statistics([85, 90, 78, 92, 88])
print(stats)  # Output: (433, 5, 86.6)

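When a function returns several comma-separated values, Python automatically packs them into a tuple. This is why a function can appear to return multiple values: it actually returns a single tuple containing them.

Tuple Unpacking

Tuple unpacking is the reverse process: extracting the values of a tuple into separate variables: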

python

# Basic unpacking
point = (100, 200)
x, y = point
print(f"x = {x}, y = {y}")  # Output: x = 100, y = 200
 
# Unpacking works with any iterable, not just tuples
name, age, email = ["Bob", 30, "bob@example.com"]
print(f"{name} is {age} years old")  # Output: Bob is 30 years old
 
# Unpack a function's return value directly
total, count, average = get_statistics([95, 88, 92, 85])
print(f"Average of {count} scores: {average}")  # Output: Average of 4 scores: 90.0

The number of variables on the left must match the number of elements in the sequence. If they do not match, Python raises a ValueError:

python

# This would raise an error
coordinates = (10, 20, 30)
# x, y = coordinates  # ValueError: too many values to unpack (expected 2)
 
# So would this
point = (5, 10)
# x, y, z = point  # ValueError: not enough values to unpack (expected 3, got 2)

Swapping Variables with Tuple Unpacking

Tuple unpacking offers an elegant way to swap variable values without a temporary variable:

python

# Traditional way: swap using a temporary variable
a = 10
b = 20
temp = a
a = b
b = temp
print(f"a = {a}, b = {b}")  # Output: a = 20, b = 10
 
# The Pythonic way: swap using tuple unpacking
x = 100
y = 200
x, y = y, x  # Swap in a single line!
print(f"x = {x}, y = {y}")  # Output: x = 200, y = 100

# Rotate more than two variables
first = "A"
second = "B"
third = "C"
first, second, third = third, first, second
print(first, second, third)  # Output: C A B

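How does this work? Python first evaluates the entire right-hand side, building the tuple (y, x), and only then unpacks it into the variables on the left. Because the old values are captured before any assignment happens, no temporary variable is needed.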
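
The same evaluate-first rule is what makes simultaneous updates work. The following is a minimal sketch; the function name fibonacci is our own illustration and does not come from the chapter:

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers as a list."""
    a, b = 0, 1
    result = []
    for _ in range(n):
        result.append(a)
        # The tuple (b, a + b) is built from the OLD a and b, then unpacked
        a, b = b, a + b
    return result

print(fibonacci(8))  # Output: [0, 1, 1, 2, 3, 5, 8, 13]
```

Writing the update as two separate statements (a = b, then b = a + b) would use the new value of a in the second line and give a different result.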

Extended Unpacking with the Star Operator

Python provides extended unpacking with the * operator, which captures multiple elements:

python

# Unpack with a "rest" variable
scores = (95, 88, 92, 85, 90, 87)
first, second, *rest = scores
print(f"Top two: {first}, {second}")  # Output: Top two: 95, 88
print(f"Others: {rest}")  # Output: Others: [92, 85, 90, 87]
print(type(rest))  # Output: <class 'list'>
 
# The star can appear in any position
numbers = (1, 2, 3, 4, 5)
first, *middle, last = numbers
print(f"First: {first}")  # Output: First: 1
print(f"Middle: {middle}")  # Output: Middle: [2, 3, 4]
print(f"Last: {last}")  # Output: Last: 5
 
# Capture the beginning
*beginning, second_last, last = numbers
print(f"Beginning: {beginning}")  # Output: Beginning: [1, 2, 3]
print(f"Last two: {second_last}, {last}")  # Output: Last two: 4, 5

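Note that the starred variable always captures its elements as a list, even when you are unpacking a tuple. If there is nothing left to capture, the starred variable becomes an empty list: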

python

# When there is nothing left to capture
a, b, *rest = (10, 20)
print(rest)  # Output: []
 
# Only one starred name is allowed per unpacking
# first, *middle, *end = (1, 2, 3, 4)  # SyntaxError: multiple starred expressions in assignment

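Ignoring Values with an Underscore

Sometimes you need only some of the values in a tuple. By convention, Python programmers use an underscore _ as the variable name for values they intend to ignore: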

python

# Parse a date string
date_string = "2024-03-15"
year, month, day = date_string.split("-")
print(f"Month: {month}")  # Output: Month: 03
 
# If we only care about the month
_, month, _ = date_string.split("-")
print(f"Month: {month}")  # Output: Month: 03
 
# Combined with extended unpacking
data = ("Alice", 25, "Engineer", "New York", "alice@example.com")
name, age, *_, email = data
print(f"{name} ({age}): {email}")  # Output: Alice (25): alice@example.com

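The underscore is an ordinary variable name, but using it signals to other programmers (and to your future self) that you are deliberately ignoring those values.

Practical Examples of Packing and Unpacking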

python

# Return multiple values from a calculation
def calculate_rectangle_properties(width, height):
    """Calculate area and perimeter of a rectangle."""
    area = width * height
    perimeter = 2 * (width + height)
    return area, perimeter  # Packing
 
# Unpack the result
rect_area, rect_perimeter = calculate_rectangle_properties(5, 3)
print(f"Area: {rect_area}, Perimeter: {rect_perimeter}")  # Output: Area: 15, Perimeter: 16
 
# Unpack while iterating
students = [
    ("Alice", 85),
    ("Bob", 92),
    ("Carol", 78)
]
 
for name, score in students:  # Unpacking in the loop
    print(f"{name}: {score}")
# Output:
# Alice: 85
# Bob: 92
# Carol: 78

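Tuple packing and unpacking make Python code more readable and expressive. Instead of accessing tuple elements by index (student[0], student[1]), you can unpack them into meaningful variable names.

15.3) Tuples Are Immutable: When This Is Useful

The defining feature of a tuple is its immutability: once created, a tuple's contents cannot be changed. You cannot add, remove, or replace elements. This may look like a limitation, but it brings important benefits.

What Immutability Means in Practice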

python

# Create a tuple
coordinates = (10, 20, 30)
print(coordinates)  # Output: (10, 20, 30)
 
# Trying to modify it raises an error
# coordinates[0] = 15  # TypeError: 'tuple' object does not support item assignment
 
# Trying to add an element raises an error
# coordinates.append(40)  # AttributeError: 'tuple' object has no attribute 'append'
 
# Trying to delete an element raises an error
# del coordinates[1]  # TypeError: 'tuple' object doesn't support item deletion

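When Python says that a tuple does not support item assignment, it means you cannot change what is stored at any position in the tuple. The tuple's structure is fixed at creation.

Comparing Mutable Lists and Immutable Tuples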

python

# Lists are mutable: you can modify them
shopping_list = ["milk", "bread", "eggs"]
shopping_list[1] = "butter"  # Replace an element
shopping_list.append("cheese")  # Add an element
print(shopping_list)  # Output: ['milk', 'butter', 'eggs', 'cheese']
 
# Tuples are immutable: you cannot modify them
product_dimensions = (10, 20, 5)  # width, height, depth in cm
# product_dimensions[0] = 12  # TypeError: cannot modify
# product_dimensions.append(3)  # AttributeError: no append method
 
# To "change" a tuple, you must create a new one
new_dimensions = (12, 20, 5)  # A brand-new tuple
print(new_dimensions)  # Output: (12, 20, 5)

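Why Immutability Is Useful

Immutability brings several practical benefits:

1. Data Integrity and Safety

When you pass a tuple to a function, you know the function cannot add, remove, or replace its elements by accident: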

python

def calculate_distance(point1, point2):
    """Calculate distance between two 2D points."""
    x1, y1 = point1
    x2, y2 = point2
 
    dx = x2 - x1
    dy = y2 - y1
    
    # Even if we wanted to, we can't modify the input tuples
 
    return (dx**2 + dy**2) ** 0.5
 
start = (0, 0)
end = (3, 4)
distance = calculate_distance(start, end)
print(f"Distance: {distance}")  # Output: Distance: 5.0
print(f"Start point unchanged: {start}")  # Output: Start point unchanged: (0, 0)

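With a list, you would have to worry about whether the function modifies your data. With a tuple of immutable values such as numbers or strings, you are guaranteed it cannot.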
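
To make the contrast concrete, here is a small sketch. The helper normalize is hypothetical, invented only to stand in for a function that modifies its argument in place:

```python
def normalize(values):
    """Hypothetical helper that sorts its argument in place."""
    values.sort()
    return values

scores_list = [90, 70, 80]
normalize(scores_list)
print(scores_list)  # Output: [70, 80, 90] (the caller's list was changed)

scores_tuple = (90, 70, 80)
try:
    normalize(scores_tuple)
except AttributeError as error:
    # Tuples have no sort() method, so the in-place change is rejected
    print(f"Rejected: {error}")
print(scores_tuple)  # Output: (90, 70, 80)
```

The list was silently reordered for the caller, while the tuple forced the problem into the open and kept its original order.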

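2. Using Tuples as Dictionary Keys

As we will explore in more depth in Chapter 17, dictionary keys must be hashable: they must have a hash value that never changes. A tuple whose elements are all hashable can serve as a dictionary key; a mutable object such as a list cannot: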

python

# Tuples can be dictionary keys
locations = {
    (0, 0): "Origin",
    (10, 20): "Point A",
    (30, 40): "Point B"
}
print(locations[(10, 20)])  # Output: Point A
 
# Lists cannot be dictionary keys
# locations_bad = {
#     [0, 0]: "Origin"  # TypeError: unhashable type: 'list'
# }

3. Expressing Intent

Using a tuple instead of a list tells other programmers (and yourself) that this data is not supposed to change:

python

# RGB color values: these should not change
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)
 
# Database connection parameters: fixed configuration
DB_CONFIG = ("localhost", 5432, "myapp", "production")
 
# Geographic coordinates: the location does not change
EIFFEL_TOWER = (48.8584, 2.2945)  # latitude, longitude

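When you see a tuple in code, you know at once that the data is meant to stay the same. When you see a list, you know it may be modified.

4. Performance Benefits

Because tuples are immutable, Python can optimize them in ways that are not possible for lists. We will study the sys module in Chapter 27; for now, all you need to know is that sys.getsizeof() reports how much memory an object occupies: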

python

import sys
 
# A tuple uses less memory than the equivalent list
tuple_data = (1, 2, 3, 4, 5)
list_data = [1, 2, 3, 4, 5]
 
print(f"Tuple size: {sys.getsizeof(tuple_data)} bytes")  # Output: Tuple size: 80 bytes (may vary by Python version)
print(f"List size: {sys.getsizeof(list_data)} bytes")    # Output: List size: 104 bytes (may vary by Python version)
 
# Creating a tuple is faster
import timeit
 
tuple_time = timeit.timeit("(1, 2, 3, 4, 5)", number=1000000)
list_time = timeit.timeit("[1, 2, 3, 4, 5]", number=1000000)
 
print(f"Tuple creation: {tuple_time:.4f} seconds")
print(f"List creation: {list_time:.4f} seconds")
# Example output: Tuple creation: 0.0055 seconds, List creation: 0.0292 seconds

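15.4) The Immutability Trap: When Tuples Contain Mutable Items

Although a tuple itself is immutable, it can contain mutable objects such as lists or dictionaries. This creates a subtle but important distinction: the tuple's structure is fixed, but the contents of any mutable object inside it can still change.

Understanding the Distinction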

python

# A tuple containing a list
student_data = ("Alice", 20, [85, 90, 78])  # name, age, scores
print(student_data)  # Output: ('Alice', 20, [85, 90, 78])
 
# We cannot reassign a tuple element
# student_data[0] = "Bob"  # TypeError: 'tuple' object does not support item assignment
 
# But we can modify the list inside the tuple
student_data[2].append(92)  # Add a new score
print(student_data)  # Output: ('Alice', 20, [85, 90, 78, 92])
 
student_data[2][0] = 88  # Change an existing score
print(student_data)  # Output: ('Alice', 20, [88, 90, 78, 92])

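What is happening here? The tuple stores three references: one to the string "Alice", one to the integer 20, and one to a list object. The tuple's structure, meaning which objects it refers to, cannot change. The list object itself is mutable, however, so its contents can.

Visualizing the Difference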

python

# The tuple's structure is fixed
data = ("Python", [1, 2, 3])
 
# This tries to change which object the tuple refers to: not allowed
# data[1] = [4, 5, 6]  # TypeError
 
# This modifies the list the tuple refers to: allowed
data[1].append(4)
print(data)  # Output: ('Python', [1, 2, 3, 4])
 
# The tuple still refers to the same list object
# What changed is the list's contents, not which list the tuple points to

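Think of it this way: a tuple is like a row of boxes, each holding a reference to some object. The boxes themselves are locked (immutable), but if a box refers to a mutable object, that object can still change.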
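
One way to check the box picture is to compare object identities before and after a change. This is a small sketch using the built-in id() function and the is operator; the variable names are our own:

```python
scores = [85, 90]
record = ("Alice", scores)
identity_before = id(record[1])

record[1].append(78)  # Mutate the list in place
identity_after = id(record[1])

print(identity_before == identity_after)  # Output: True
print(record[1] is scores)                # Output: True
print(record)                             # Output: ('Alice', [85, 90, 78])
```

The identity of the list never changes, so the tuple still holds exactly the same reference; only the list's contents are different.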

Dictionaries Inside Tuples

The same principle applies to dictionaries inside tuples:

python

# A tuple containing a dictionary
user_profile = ("alice", {"email": "alice@example.com", "age": 25})
print(user_profile)  # Output: ('alice', {'email': 'alice@example.com', 'age': 25})
 
# Cannot change which dictionary the tuple refers to
# user_profile[1] = {"email": "newemail@example.com"}  # TypeError
 
# But the dictionary itself can be modified
user_profile[1]["age"] = 26
user_profile[1]["city"] = "New York"
print(user_profile)  # Output: ('alice', {'email': 'alice@example.com', 'age': 26, 'city': 'New York'})

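Why This Matters for Dictionary Keys

A tuple can be used as a dictionary key only if all of its elements are hashable. Even though the tuple itself is immutable, a tuple that contains a mutable object such as a list is not hashable and therefore cannot be used as a dictionary key.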

python

# This tuple can be created, but it cannot be used as a dictionary key
tuple_with_list = ("key", [1, 2, 3])
# data = {tuple_with_list: "value"}  # TypeError: unhashable type: 'list'

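Use a tuple as a dictionary key only when it is made up entirely of immutable objects (strings, numbers, frozensets, or other tuples built the same way).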
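
If you are unsure whether a particular tuple qualifies, you can simply ask Python to hash it. The helper below is a sketch; the name is_hashable is our own, and it relies only on the built-in hash() function raising TypeError for unhashable objects:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (helper name is our own)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2, 3)))        # Output: True
print(is_hashable(("key", [1, 2])))  # Output: False
print(is_hashable(((1, 2), "a")))    # Output: True
```

The second tuple fails because hashing a tuple hashes each of its elements, and the list inside it cannot be hashed.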

Creating Truly Immutable Tuples

If you need a tuple that is immutable all the way down, make sure all of its contents are immutable too:

python

# Fully immutable tuples: only immutable types inside
point_3d = (10, 20, 30)  # All integers
rgb_color = (255, 128, 0)  # All integers
coordinates = ((10, 20), (30, 40))  # A tuple of tuples
 
# These are safe to use as dictionary keys
color_names = {
    (255, 0, 0): "Red",
    (0, 255, 0): "Green",
    (0, 0, 255): "Blue"
}
 
# Nested tuples are still immutable
nested = ((1, 2), (3, 4))
# nested[0][0] = 5  # TypeError: 'tuple' object does not support item assignment

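When Mutable Contents Are Intentional

Sometimes you do want a tuple with mutable contents, for example when you have a fixed record structure in which one field needs to change: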

python

# Student record: identity is fixed, but grades change
def create_student(name, student_id):
    """Create a student record with empty grade list."""
    return (name, student_id, [])  # name and ID fixed, grades can change
 
student = create_student("Alice", "S12345")
print(student)  # Output: ('Alice', 'S12345', [])
 
# The student's identity is fixed
print(f"Student: {student[0]} (ID: {student[1]})")  # Output: Student: Alice (ID: S12345)
 
# But we can add grades as they come in
student[2].append(85)
student[2].append(92)
student[2].append(78)
print(f"Grades: {student[2]}")  # Output: Grades: [85, 92, 78]
 
# The tuple structure protects name and ID from accidental modification
# while still allowing the grade list to grow

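This pattern is useful when you want to protect part of the data while allowing another part to change. Just keep the tuple's immutability and the mutability of its contents clearly separate in your mind.

15.5) When to Use Tuples Instead of Lists

Choosing between a tuple and a list is an important design decision. Both are sequences, but they serve different purposes and express different intent.

Use Tuples for Fixed, Heterogeneous Data

Tuples fit best when you have a fixed number of items, often of different types, that together represent one logical entity: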

python

# Student record: name, age, major, GPA
student = ("Alice", 20, "Computer Science", 3.8)
 
# Geographic coordinates: latitude, longitude
location = (40.7128, -74.0060)  # New York City
 
# RGB color: red, green, blue
color = (255, 128, 0)
 
# Database connection: host, port, database, username
db_connection = ("localhost", 5432, "myapp", "admin")
 
# Date: year, month, day
date = (2024, 3, 15)

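Each tuple represents one complete "record" in which every position has a definite meaning. In the student record, for instance, the first element is always the name and the second is always the age.

Use Lists for Homogeneous Collections

Lists fit best when you have a variable number of similar items that may be added, removed, or reordered: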

python

# Shopping list: items of the same kind (strings)
shopping_list = ["milk", "bread", "eggs", "butter"]
shopping_list.append("cheese")  # Add more items as needed
shopping_list.remove("bread")   # Remove an item

# Test scores: items of the same kind (numbers)
test_scores = [85, 92, 78, 95, 88]
test_scores.append(90)  # Add a new score
test_scores.sort()      # Reorder the scores

# Usernames: items of the same kind (strings)
active_users = ["alice", "bob", "carol"]
active_users.extend(["dave", "eve"])  # Add several users

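Lists are for collections in which the number of items may change and every item plays the same role.

Use Tuples for Function Return Values

When a function returns several related values, a tuple is the natural choice: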

python

def get_user_info(user_id):
    """Retrieve user information from database."""
    # Simulate database lookup
    return "Alice", "alice@example.com", 25, "New York"
 
# Unpack the returned tuple
name, email, age, city = get_user_info(101)
print(f"{name} from {city}")  # Output: Alice from New York
 
def calculate_statistics(numbers):
    """Calculate min, max, and average of numbers."""
    if not numbers:
        return None, None, None
    
    minimum = min(numbers)
    maximum = max(numbers)
    average = sum(numbers) / len(numbers)
    return minimum, maximum, average
 
# Unpack the result
min_val, max_val, avg_val = calculate_statistics([85, 92, 78, 95, 88])
print(f"Range: {min_val} to {max_val}, Average: {avg_val}")
# Output: Range: 78 to 95, Average: 87.6

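Returning a tuple makes it clear that the values are related and belong together.

Use Tuples for Compound Dictionary Keys

When you need a compound key in a dictionary, a tuple is the natural choice: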

python

# Store student grades by course and semester
grades = {
    ("CS101", "Fall2023"): 85,
    ("CS101", "Spring2024"): 90,
    ("MATH201", "Fall2023"): 88,
    ("MATH201", "Spring2024"): 92
}
 
# Look up a specific grade
course = "CS101"
semester = "Spring2024"
grade = grades[(course, semester)]
print(f"Grade in {course} ({semester}): {grade}")  # Output: Grade in CS101 (Spring2024): 90
 
# Grid coordinates as dictionary keys
grid = {
    (0, 0): "Start",
    (5, 3): "Obstacle",
    (10, 10): "Goal"
}
 
position = (5, 3)
if position in grid:
    print(f"At {position}: {grid[position]}")  # Output: At (5, 3): Obstacle

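Lists cannot be dictionary keys because they are mutable and therefore unhashable; tuples of hashable values can.

Use Tuples for Immutable Configuration

When you have configuration data that should not change, a tuple expresses that intent: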

python

# Application settings: should stay constant
APP_CONFIG = (
    "MyApp",           # Application name
    "1.0.0",          # Version
    "production",     # Environment
    True,             # Debug mode
    8080              # Port
)
 
# UI color palette: these colors are fixed
COLOR_PALETTE = (
    (255, 0, 0),      # Primary red
    (0, 128, 255),    # Primary blue
    (255, 255, 255),  # White
    (0, 0, 0)         # Black
)
 
# API endpoints: these URLs do not change
API_ENDPOINTS = (
    "https://api.example.com/users",
    "https://api.example.com/products",
    "https://api.example.com/orders"
)

Decision Guide

python

# Use TUPLES when:
# 1. The data represents a single record with a fixed structure
employee = ("E001", "Alice", "Engineering", 75000)
 
# 2. You are returning multiple values from a function
def divide_with_remainder(a, b):
    return a // b, a % b
 
# 3. The data needs to serve as a dictionary key
cache = {(5, 10): 50, (3, 7): 21}
 
# 4. The data should not be modified
SCREEN_RESOLUTION = (1920, 1080)
 
# Use LISTS when:
# 1. The data is a collection of similar items that may change
tasks = ["Write code", "Test code", "Deploy code"]
tasks.append("Document code")
 
# 2. You need to add, remove, or reorder items
scores = [85, 90, 78]
scores.sort()
scores.append(92)
 
# 3. All items serve the same purpose
usernames = ["alice", "bob", "carol"]
 
# 4. The size of the collection is not known in advance
results = []
for i in range(10):
    results.append(i * 2)

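15.6) Understanding range Objects in Depth

Now that we understand when to use tuples and lists, let's explore another of Python's immutable sequence types: the range. The range type represents an immutable sequence of numbers. Unlike lists and tuples, which store every element in memory, a range object computes its numbers on demand, which makes it extremely memory-efficient for representing large sequences.

Creating range Objects

range() can be called in three forms to create a range object: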

python

# One argument: range(stop)
# Produces the numbers from 0 up to, but not including, stop
numbers = range(5)
print(list(numbers))  # Output: [0, 1, 2, 3, 4]
 
# Two arguments: range(start, stop)
# Produces the numbers from start up to, but not including, stop
numbers = range(2, 7)
print(list(numbers))  # Output: [2, 3, 4, 5, 6]
 
# Three arguments: range(start, stop, step)
# Produces the numbers from start up to, but not including, stop, adding step each time
numbers = range(0, 10, 2)
print(list(numbers))  # Output: [0, 2, 4, 6, 8]
 
# Count down with a negative step
numbers = range(10, 0, -1)
print(list(numbers))  # Output: [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

Note that we used list() to convert the range to a list so that we could see its contents. A range object does not display all of its values when printed:

python

r = range(5)
print(r)  # Output: range(0, 5)
print(type(r))  # Output: <class 'range'>

How range Objects Work

A range object does not store all of its values in memory. Instead, it computes each value when it is needed:

python

import sys
 
# A range representing one million numbers
large_range = range(1000000)
print(f"Range size: {sys.getsizeof(large_range)} bytes")  # Output: Range size: 48 bytes (may vary by Python version)
 
# A list containing one million numbers
large_list = list(range(1000000))
print(f"List size: {sys.getsizeof(large_list)} bytes")  # Output: List size: 8000056 bytes (approximately 8MB)
 
# The range is tiny; the list is huge!

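A range object stores only three values: start, stop, and step. It computes each number in the sequence only when you ask for it. This makes ranges extremely efficient for large sequences.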
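
The three stored values are available as the attributes start, stop, and step. The sketch below reads them and recomputes an element by hand; the helper nth_value is our own illustration of the idea (for non-negative indexes only) and is not how CPython is actually written:

```python
r = range(10, 50, 5)
print(r.start, r.stop, r.step)  # Output: 10 50 5

def nth_value(rng, i):
    """Compute element i from start and step alone (sketch, assumes i >= 0)."""
    return rng.start + i * rng.step

print(nth_value(r, 3))  # Output: 25
print(r[3])             # Output: 25
```

Because every element follows from a simple formula, there is no need to keep the elements themselves in memory.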

Using range in for Loops

As we learned in Chapter 12, range is most often used with for loops:

python

# Count from 0 to 4
for i in range(5):
    print(f"Count: {i}")
# Output:
# Count: 0
# Count: 1
# Count: 2
# Count: 3
# Count: 4
 
# Count from 1 to 10
for i in range(1, 11):
    print(i, end=" ")
print()  # Output: 1 2 3 4 5 6 7 8 9 10
 
# Count by twos
for i in range(0, 20, 2):
    print(i, end=" ")
print()  # Output: 0 2 4 6 8 10 12 14 16 18
 
# Count down
for i in range(5, 0, -1):
    print(f"T-minus {i}")
# Output:
# T-minus 5
# T-minus 4
# T-minus 3
# T-minus 2
# T-minus 1

Indexing and Slicing range Objects

Like other sequences, range objects support indexing and slicing:

python

# Create a range
numbers = range(10, 50, 5)  # 10, 15, 20, 25, 30, 35, 40, 45
 
# Indexing
print(numbers[0])   # Output: 10
print(numbers[3])   # Output: 25
print(numbers[-1])  # Output: 45
 
# Slicing returns a new range
subset = numbers[2:5]
print(subset)  # Output: range(20, 35, 5)
print(list(subset))  # Output: [20, 25, 30]
 
# Length
print(len(numbers))  # Output: 8

Membership Testing

You can use the in operator to check whether a number is in a range:

python

# Even numbers from 0 to 20
evens = range(0, 21, 2)
 
print(10 in evens)  # Output: True
print(15 in evens)  # Output: False
print(20 in evens)  # Output: True
 
# This is very efficient: Python does not generate all the numbers
# It calculates whether the number would appear in the sequence
large_range = range(0, 1000000, 3)
print(999999 in large_range)  # Output: True (instant, no iteration needed)

For integers, Python can decide membership with arithmetic instead of generating every number, so the test is very fast even for enormous ranges.
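
The arithmetic involved is roughly a bounds check plus a divisibility check. The function below is a sketch of that idea for integer values; the name in_range is our own, and this is not CPython's actual implementation:

```python
def in_range(value, start, stop, step):
    """Arithmetic membership test for integers (a sketch, assumes step != 0)."""
    if step > 0:
        within_bounds = start <= value < stop
    else:
        within_bounds = stop < value <= start
    # The value must also land exactly on one of the steps
    return within_bounds and (value - start) % step == 0

print(in_range(999999, 0, 1000000, 3))  # Output: True
print(in_range(15, 0, 21, 2))           # Output: False
print(in_range(7, 10, 0, -1))           # Output: True
```

No loop is needed, which is why the cost does not grow with the size of the range.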

Empty and Reversed Ranges

python

# Empty range: stop equals start
empty = range(5, 5)
print(list(empty))  # Output: []
print(len(empty))   # Output: 0
 
# Empty range: stop cannot be reached with the given step
impossible = range(1, 10, -1)  # Cannot count upward with a negative step
print(list(impossible))  # Output: []
 
# Reversed range
backwards = range(10, 0, -1)
print(list(backwards))  # Output: [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
 
# Reversed range with negative numbers
negative_range = range(-5, -15, -2)
print(list(negative_range))  # Output: [-5, -7, -9, -11, -13]

When to Use range Instead of a List

python

# Use range when:
# 1. You need a sequence of numbers to iterate over
for i in range(100):
    # Process something 100 times
    pass
 
# 2. You need indexes for a sequence
items = ["a", "b", "c", "d"]
for i in range(len(items)):
    print(f"Index {i}: {items[i]}")
 
# 3. Memory efficiency matters for a large sequence
# This uses almost no extra memory
for i in range(1000000):
    if i % 100000 == 0:
        print(i)
 
# Use a list when:
# 1. You need to store actual values
squares = [1, 4, 9, 16, 25]
 
# 2. You need to modify the sequence
numbers = list(range(5))
numbers[2] = 100  # Modify a value
numbers.append(200)  # Add a value
 
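# 3. You need list-only methods such as reverse() or sort()
#    (sum(), max(), and sorted() also work directly on a range)
data = list(range(10))
data.reverse()
print(data)  # Output: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]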

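The range object is a good example of Python's efficiency. It provides most of the benefits of a sequence while avoiding the memory cost of storing every element.

15.7) Converting Between Lists, Tuples, and Ranges

Python makes it easy to convert between sequence types. Understanding these conversions helps you pick the right type for each situation and transform data when you need to.

Converting to a List

The list() function converts any sequence to a list: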

python

# Tuple to list
student_tuple = ("Alice", 20, "CS")
student_list = list(student_tuple)
print(student_list)  # Output: ['Alice', 20, 'CS']
print(type(student_list))  # Output: <class 'list'>
 
# Now we can modify it
student_list[1] = 21
student_list.append(3.8)
print(student_list)  # Output: ['Alice', 21, 'CS', 3.8]
 
# range to list
numbers = range(5)
numbers_list = list(numbers)
print(numbers_list)  # Output: [0, 1, 2, 3, 4]
 
# String to list (each character becomes an element)
text = "Python"
chars = list(text)
print(chars)  # Output: ['P', 'y', 't', 'h', 'o', 'n']

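Converting to a list is useful when you need to modify a sequence or use list-specific methods such as append(), sort(), or remove().

Converting to a Tuple

The tuple() function converts any sequence to a tuple: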

python

# List to tuple
scores_list = [85, 90, 78, 92]
scores_tuple = tuple(scores_list)
print(scores_tuple)  # Output: (85, 90, 78, 92)
print(type(scores_tuple))  # Output: <class 'tuple'>
 
# Now it is immutable
# scores_tuple[0] = 88  # TypeError: 'tuple' object does not support item assignment
 
# range to tuple
numbers = range(1, 6)
numbers_tuple = tuple(numbers)
print(numbers_tuple)  # Output: (1, 2, 3, 4, 5)
 
# String to tuple
text = "Hi"
chars_tuple = tuple(text)
print(chars_tuple)  # Output: ('H', 'i')

Converting to a tuple is useful when you want to protect data from modification, or when you need to use a sequence as a dictionary key.
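
As a small sketch of the dictionary-key case, the example below counts how often a route is seen. The names visit_counts and route are hypothetical and exist only for this illustration:

```python
visit_counts = {}
route = ["home", "office"]  # A list cannot be used as a key
key = tuple(route)          # A tuple of strings can

visit_counts[key] = visit_counts.get(key, 0) + 1
# An equal tuple built later refers to the same dictionary entry
visit_counts[tuple(["home", "office"])] += 1

print(visit_counts)  # Output: {('home', 'office'): 2}
```

Two tuples with equal contents compare equal and hash the same, so they always address the same entry.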

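15.8) Common Sequence Operations for Strings, Lists, Tuples, and Ranges

Python's sequence types (strings, lists, tuples, and ranges) share many common operations. Understanding them helps you work efficiently with any sequence type.

Length, Minimum, and Maximum

All sequences support the len(), min(), and max() functions: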

python

# String
text = "Python"
print(len(text))  # Output: 6
print(min(text))  # Output: P (smallest character by Unicode value)
print(max(text))  # Output: y (largest character by Unicode value)
 
# List
numbers = [45, 12, 78, 23, 56]
print(len(numbers))  # Output: 5
print(min(numbers))  # Output: 12
print(max(numbers))  # Output: 78
 
# Tuple
scores = (85, 92, 78, 95, 88)
print(len(scores))  # Output: 5
print(min(scores))  # Output: 78
print(max(scores))  # Output: 95
 
# Range
nums = range(10, 50, 5)
print(len(nums))  # Output: 8
print(min(nums))  # Output: 10
print(max(nums))  # Output: 45

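For min() and max() to work, the sequence must not be empty and its elements must be comparable with one another. You cannot take the minimum of a list that contains both strings and numbers: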

python

mixed = [1, "hello", 3]
# print(min(mixed))  # TypeError: '<' not supported between instances of 'str' and 'int'

Indexing and Negative Indexing

All sequences support access by positive and negative index:

python

# Positive indexes (starting at 0)
text = "Python"
numbers = [10, 20, 30, 40, 50]
coords = (5, 10, 15)
values = range(0, 100, 10)
 
print(text[0])      # Output: P
print(numbers[2])   # Output: 30
print(coords[1])    # Output: 10
print(values[3])    # Output: 30
 
# Negative indexes (counting from the end)
print(text[-1])     # Output: n (last character)
print(numbers[-2])  # Output: 40 (second from end)
print(coords[-3])   # Output: 5 (third from end, which is first)
print(values[-1])   # Output: 90 (last value in range)

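Negative indexes count from the end: -1 is the last element, -2 is the second to last, and so on.

Membership Testing with in and not in

All sequences support membership testing: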

python

# String: checks for a substring
text = "Python Programming"
print("Python" in text)      # Output: True
print("Java" in text)        # Output: False
print("gram" in text)        # Output: True (substring)
print("PYTHON" not in text)  # Output: True (case-sensitive)
 
# List
fruits = ["apple", "banana", "cherry", "date"]
print("banana" in fruits)    # Output: True
print("grape" in fruits)     # Output: False
print("apple" not in fruits) # Output: False
 
# Tuple
coordinates = (10, 20, 30, 40)
print(20 in coordinates)     # Output: True
print(25 in coordinates)     # Output: False
print(50 not in coordinates) # Output: True
 
# Range: very efficient, no iteration needed
numbers = range(0, 100, 2)  # Even numbers 0 to 98
print(50 in numbers)         # Output: True
print(51 in numbers)         # Output: False (odd number)
print(100 in numbers)        # Output: False (stop is exclusive)

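For a range, Python can decide integer membership with arithmetic instead of checking every element, so the test is very fast even when the range is large.

Concatenation and Repetition

Strings, lists, and tuples support concatenation with + and repetition with *: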

python

# Concatenation with +
text1 = "Hello"
text2 = " World"
print(text1 + text2)  # Output: Hello World
 
list1 = [1, 2, 3]
list2 = [4, 5, 6]
print(list1 + list2)  # Output: [1, 2, 3, 4, 5, 6]
 
tuple1 = (10, 20)
tuple2 = (30, 40)
print(tuple1 + tuple2)  # Output: (10, 20, 30, 40)
 
# Repetition with *
print("Ha" * 3)           # Output: HaHaHa
print([0] * 5)            # Output: [0, 0, 0, 0, 0]
print((1, 2) * 3)         # Output: (1, 2, 1, 2, 1, 2)

Important: range does not support concatenation or repetition:

python

r1 = range(5)
r2 = range(5, 10)
# combined = r1 + r2  # TypeError: unsupported operand type(s) for +: 'range' and 'range'
 
# To combine ranges, convert them to lists or tuples first
combined = list(r1) + list(r2)
print(combined)  # Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
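
If you only need to loop over the combined numbers, one alternative to building lists is itertools.chain from the standard library, which yields the items of each range in turn. This is a sketch of that option, not something the chapter itself relies on:

```python
from itertools import chain

r1 = range(5)
r2 = range(5, 10)

total = 0
for number in chain(r1, r2):  # Yields 0..4, then 5..9, one value at a time
    total += number

print(total)                # Output: 45
print(list(chain(r1, r2)))  # Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The chained result is an iterator rather than a sequence, so it cannot be indexed or sliced; convert to a list when you need those operations.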

Counting Occurrences

The count() method returns the number of times an element occurs:

python

# String: counts occurrences of a substring
text = "Mississippi"
print(text.count("s"))   # Output: 4
print(text.count("ss"))  # Output: 2
print(text.count("i"))   # Output: 4
 
# List
numbers = [1, 2, 3, 2, 4, 2, 5]
print(numbers.count(2))  # Output: 3
print(numbers.count(6))  # Output: 0
 
# Tuple
grades = (85, 90, 85, 92, 85, 88)
print(grades.count(85))  # Output: 3
print(grades.count(95))  # Output: 0
 
# Range: count() works directly and returns 0 or 1
nums = range(0, 20, 2)
print(nums.count(10))  # Output: 1
print(nums.count(11))  # Output: 0

Finding the Index of an Element

The index() method returns the position of the first occurrence:

python

# String
text = "Python Programming"
print(text.index("P"))      # Output: 0 (first P)
print(text.index("Pro"))    # Output: 7 (substring position)
# print(text.index("Java"))  # ValueError: substring not found
 
# List
fruits = ["apple", "banana", "cherry", "banana"]
print(fruits.index("banana"))  # Output: 1 (first occurrence)
print(fruits.index("cherry"))  # Output: 2
# print(fruits.index("grape"))  # ValueError: 'grape' is not in list
 
# Tuple
coordinates = (10, 20, 30, 20, 40)
print(coordinates.index(20))  # Output: 1 (first occurrence)
print(coordinates.index(40))  # Output: 4
 
# Range: index() works directly, with no conversion needed
nums = range(10, 50, 5)
print(nums.index(25))  # Output: 3
# print(nums.index(26))  # ValueError: 26 is not in range

If the element is not found, index() raises a ValueError. To avoid this, you can check with in first:

python

fruits = ["apple", "banana", "cherry"]
search_fruit = "grape"
 
if search_fruit in fruits:
    position = fruits.index(search_fruit)
    print(f"{search_fruit} found at position {position}")
else:
    print(f"{search_fruit} not found")
# Output: grape not found
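
Another common approach is to call index() directly and handle the ValueError. The sketch below wraps that in a helper; the name find_position and the choice to return None for a missing item are our own:

```python
def find_position(sequence, item):
    """Return the index of item, or None if it is absent (helper name is our own)."""
    try:
        return sequence.index(item)
    except ValueError:
        return None

fruits = ["apple", "banana", "cherry"]
print(find_position(fruits, "banana"))      # Output: 1
print(find_position(fruits, "grape"))       # Output: None
print(find_position(range(10, 50, 5), 25))  # Output: 3
```

This version searches the sequence once, whereas checking with in and then calling index() searches it twice.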

Iterating with for Loops

All sequences can be iterated over with a for loop:

python

# String: iterate over characters
for char in "Python":
    print(char, end=" ")
print()  # Output: P y t h o n
 
# List
for fruit in ["apple", "banana", "cherry"]:
    print(f"I like {fruit}")
# Output:
# I like apple
# I like banana
# I like cherry
 
# Tuple
for score in (85, 90, 78):
    print(f"Score: {score}")
# Output:
# Score: 85
# Score: 90
# Score: 78
 
# range
for i in range(1, 6):
    print(f"Count: {i}")
# Output:
# Count: 1
# Count: 2
# Count: 3
# Count: 4
# Count: 5

Comparison Operations

Strings, lists, and tuples can be compared with ==, !=, <, >, <=, and >= (ranges support only == and !=):

python

# Equality
print([1, 2, 3] == [1, 2, 3])      # Output: True
print((1, 2, 3) == (1, 2, 3))      # Output: True
print("abc" == "abc")               # Output: True
 
# Inequality
print([1, 2, 3] != [1, 2, 4])      # Output: True
print((1, 2) != (1, 2))            # Output: False
 
# Lexicographic comparison (element by element)
print([1, 2, 3] < [1, 2, 4])       # Output: True (3 < 4)
print([1, 2, 3] < [1, 3, 0])       # Output: True (2 < 3)
print("apple" < "banana")           # Output: True (alphabetical)
print((1, 2) < (1, 2, 3))          # Output: True (shorter is less if equal so far)
 
# Comparing different types
print([1, 2, 3] == (1, 2, 3))      # Output: False (different types)

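Ordering comparisons proceed element by element from left to right. The first pair of elements that differ decides the result; if one sequence is a prefix of the other, the shorter one is smaller.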
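
Element-by-element comparison is what makes tuples handy for version numbers and for sorting records. The values below are made up for illustration:

```python
installed = (3, 9, 7)
required = (3, 10)

# 3 == 3, then 9 < 10 decides the result; the trailing 7 is never examined
print(installed >= required)  # Output: False

students = [("Bob", 92), ("Alice", 85), ("Alice", 78)]
# Sorted by name first; ties on name are broken by score
print(sorted(students))  # Output: [('Alice', 78), ('Alice', 85), ('Bob', 92)]
```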

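Understanding these common operations lets you write code that works with any sequence type, which makes your programs more flexible and reusable.

15.9) Advanced Slicing for All Sequence Types

Slicing is one of Python's most powerful features for working with sequences. We introduced basic slicing in Chapter 14, but there are more advanced techniques that apply to every sequence type.

Basic Slicing Review

Slicing uses the syntax sequence[start:stop:step] to extract part of a sequence: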

python

# Basic slicing of a string
text = "Python Programming"
print(text[0:6])    # Output: Python
print(text[7:18])   # Output: Programming
print(text[7:])     # Output: Programming (from index 7 to end)
print(text[:6])     # Output: Python (from start to index 6)
 
# Basic slicing of a list
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[2:7])   # Output: [2, 3, 4, 5, 6]
print(numbers[:5])    # Output: [0, 1, 2, 3, 4]
print(numbers[5:])    # Output: [5, 6, 7, 8, 9]
 
# Basic slicing of a tuple
coordinates = (10, 20, 30, 40, 50, 60)
print(coordinates[1:4])  # Output: (20, 30, 40)
print(coordinates[:3])   # Output: (10, 20, 30)
print(coordinates[3:])   # Output: (40, 50, 60)
 
# Basic slicing of a range
nums = range(0, 100, 10)
print(list(nums[2:5]))   # Output: [20, 30, 40]

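Remember: start is inclusive, stop is exclusive, and the result always has the same type as the original sequence.

Using step in Slices

The optional third value, step, controls how far the slice advances between elements: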

python

# Every second element
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[::2])     # Output: [0, 2, 4, 6, 8]
print(numbers[1::2])    # Output: [1, 3, 5, 7, 9]
 
# Every third element
text = "abcdefghijklmnop"
print(text[::3])        # Output: adgjmp
 
# Using start, stop, and step together
print(numbers[2:8:2])   # Output: [2, 4, 6]
print(text[1:10:2])     # Output: bdfhj

Negative step: Reversing a Sequence

A negative step reverses the direction of the slice:

python

# Reverse an entire sequence
text = "Python"
print(text[::-1])       # Output: nohtyP
 
numbers = [1, 2, 3, 4, 5]
print(numbers[::-1])    # Output: [5, 4, 3, 2, 1]
 
coordinates = (10, 20, 30, 40)
print(coordinates[::-1])  # Output: (40, 30, 20, 10)
 
# Reverse and skip
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[::-2])    # Output: [9, 7, 5, 3, 1] (every second, backwards)
 
# Reverse one segment
text = "Python Programming"
print(text[7:18][::-1])  # Output: gnimmargorP (reverse "Programming")

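With a negative step, start and stop behave differently: the slice runs from start down toward stop, and an omitted start or stop defaults to the end or the beginning of the sequence respectively: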

python

numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 
# With a negative step, start should be greater than stop
print(numbers[7:2:-1])   # Output: [7, 6, 5, 4, 3] (from 7 down to 3)
print(numbers[8:3:-2])   # Output: [8, 6, 4] (from 8 down to 4, step -2)
 
# Omitting start or stop with a negative step
print(numbers[:5:-1])    # Output: [9, 8, 7, 6] (from end down to 6)
print(numbers[5::-1])    # Output: [5, 4, 3, 2, 1, 0] (from 5 down to start)
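
If the defaults are hard to keep in your head, the built-in slice type can report them. A slice object's indices() method takes a sequence length and returns the concrete (start, stop, step) that would be used. The following sketch applies it to the slices above:

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# numbers[:5:-1], numbers[5::-1], and numbers[7:2:-1] as explicit triples
print(slice(None, 5, -1).indices(len(numbers)))  # Output: (9, 5, -1)
print(slice(5, None, -1).indices(len(numbers)))  # Output: (5, -1, -1)
print(slice(7, 2, -1).indices(len(numbers)))     # Output: (7, 2, -1)

# The triple can be passed to range() to list the positions that are visited
start, stop, step = slice(None, 5, -1).indices(len(numbers))
print([numbers[i] for i in range(start, stop, step)])  # Output: [9, 8, 7, 6]
```

The stop value of -1 in the second triple is a position for range(), meaning "one step before index 0"; it is not the negative index -1.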

Negative Indexes in Slices

You can use negative indexes for the start and stop positions:

python

text = "Python Programming"
# The last 11 characters
print(text[-11:])        # Output: Programming

# Everything except the last 11 characters
print(text[:-11])        # Output: Python (followed by a trailing space)

# From -15 up to -5
print(text[-15:-5])      # Output: hon Progra
 
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# The last 5 elements
print(numbers[-5:])      # Output: [5, 6, 7, 8, 9]
 
# Everything except the last 3 elements
print(numbers[:-3])      # Output: [0, 1, 2, 3, 4, 5, 6]
 
# From -7 up to -2
print(numbers[-7:-2])    # Output: [3, 4, 5, 6, 7]

Slicing a range

Slicing a range returns a new range object:

python

# Slice a range
numbers = range(0, 100, 5)  # 0, 5, 10, 15, ..., 95
print(numbers)  # Output: range(0, 100, 5)
 
# The slice is a new range
subset = numbers[5:10]
print(subset)  # Output: range(25, 50, 5)
print(list(subset))  # Output: [25, 30, 35, 40, 45]
 
# With a step
every_other = numbers[::2]
print(list(every_other))  # Output: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
 
# Negative step
reversed_range = numbers[::-1]
print(list(reversed_range))  # Output: [95, 90, 85, ..., 5, 0]

Empty Slices and Edge Cases

python

numbers = [1, 2, 3, 4, 5]
 
# Empty slices (positive step with start >= stop, or start at or past the end)
print(numbers[3:3])    # Output: []
print(numbers[5:10])   # Output: [] (start is already at the end)
print(numbers[10:20])  # Output: [] (both beyond length)
 
# Out-of-range slices are safe
print(numbers[-100:100])  # Output: [1, 2, 3, 4, 5] (entire sequence)
print(numbers[2:100])     # Output: [3, 4, 5] (from 2 to end)
 
# Negative step with incompatible start/stop
print(numbers[2:7:-1])    # Output: [] (can't go forward with negative step)
 
# step cannot be 0
# print(numbers[::0])  # ValueError: slice step cannot be zero

Copying with Slices

Slicing a list creates a new list, so it can be used to make a copy:

python

# Copy with a slice
original = [1, 2, 3, 4, 5]
copy = original[:]  # Slice from beginning to end
print(copy)  # Output: [1, 2, 3, 4, 5]
 
# Modifying the copy does not affect the original list
copy[0] = 100
print(f"Original: {original}")  # Output: Original: [1, 2, 3, 4, 5]
print(f"Copy: {copy}")          # Output: Copy: [100, 2, 3, 4, 5]
 
# Also works for tuples (a tuple is immutable, so CPython may simply return the same object)
original_tuple = (1, 2, 3, 4, 5)
copy_tuple = original_tuple[:]
print(copy_tuple)  # Output: (1, 2, 3, 4, 5)
 
# Also works for strings
text = "Python"
text_copy = text[:]
print(text_copy)  # Output: Python

Remember from Chapter 14, however, that this creates a shallow copy: the new list shares its nested objects with the original.

python

# The limitation of a shallow copy
original = [[1, 2], [3, 4]]
copy = original[:]
 
# Modifying a nested list affects both
copy[0][0] = 100
print(f"Original: {original}")  # Output: Original: [[100, 2], [3, 4]]
print(f"Copy: {copy}")          # Output: Copy: [[100, 2], [3, 4]]
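
When the nested objects must be independent as well, the standard library's copy.deepcopy() is one option. This is a brief sketch; the variable name independent is our own:

```python
import copy

original = [[1, 2], [3, 4]]
independent = copy.deepcopy(original)  # Copies the nested lists too

independent[0][0] = 100
print(f"Original: {original}")        # Output: Original: [[1, 2], [3, 4]]
print(f"Independent: {independent}")  # Output: Independent: [[100, 2], [3, 4]]
```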

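Tuples and ranges are essential tools in Python's sequence toolbox. Tuples provide immutable, structured data that guards against accidental modification and can serve as dictionary keys. Ranges represent sequences of numbers in a memory-efficient way, which makes them ideal for loops and large sequences. Understanding when to use each type, and how to convert between them, makes your code more efficient, safer, and clearer in intent.

The common operations shared by all sequence types (indexing, slicing, iteration, and membership testing) form a consistent interface that makes working with any sequence intuitive. Advanced slicing gives you powerful, expressive ways to extract and manipulate sequence data.

As you continue programming in Python, you will naturally choose the right sequence type for each situation: lists for collections that change, tuples for fixed records, ranges for sequences of numbers, and strings for text. This chapter has given you the knowledge to make those choices with confidence and to use each type effectively.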