5. Working with Text Using Strings

Text is everywhere in programming. From displaying messages to users, to processing data files, to building web applications, working with text is one of the most fundamental skills you'll develop as a programmer. In Python, we work with text using strings — sequences of characters that can represent words, sentences, or any kind of textual data.

You've already encountered strings briefly in previous chapters when using print() and input(). Now we'll explore strings in depth, learning how to create them, manipulate them, and use Python's powerful built-in string capabilities to solve real-world text-processing problems.

In this chapter, you'll learn how to create strings with special characters, combine strings together, extract specific parts of strings, change their case and formatting, search for text within strings, and understand why strings behave differently from numbers when it comes to modification. By the end, you'll have a solid foundation for working with text in Python.

5.1) String Literals and Escape Sequences

5.1.1) Creating String Literals

A string literal is a string value written directly in your code. You've already seen strings created with single quotes (') or double quotes ("):

python
# string_basics.py
greeting = 'Hello, World!'
message = "Python is great!"
 
print(greeting)  # Output: Hello, World!
print(message)   # Output: Python is great!

Both single and double quotes work identically in Python — the choice is yours. However, having both options is useful when your string itself contains quote characters:

python
# quotes_in_strings.py
# Using double quotes when the string contains a single quote
sentence = "It's a beautiful day!"
print(sentence)  # Output: It's a beautiful day!
 
# Using single quotes when the string contains double quotes
quote = 'She said, "Hello!"'
print(quote)  # Output: She said, "Hello!"

If you need to include the same type of quote that surrounds your string, you can escape it with a backslash (\):

python
# escaping_quotes.py
# Escaping a single quote inside a single-quoted string
sentence = 'It\'s a beautiful day!'
print(sentence)  # Output: It's a beautiful day!
 
# Escaping double quotes inside a double-quoted string
quote = "She said, \"Hello!\""
print(quote)  # Output: She said, "Hello!"

The backslash tells Python that the following quote is part of the string content, not the end of the string.

5.1.2) Multi-Line Strings with Triple Quotes

For strings that span multiple lines, Python provides triple quotes — either three single quotes (''') or three double quotes ("""):

python
# multiline_strings.py
poem = """Roses are red,
Violets are blue,
Python is awesome,
And so are you!"""
 
print(poem)
# Output:
# Roses are red,
# Violets are blue,
# Python is awesome,
# And so are you!

Triple-quoted strings preserve all the line breaks and spacing exactly as you type them. They're particularly useful for longer text blocks, documentation strings (which we'll see in Chapter 19), or when you need to include both single and double quotes without escaping:

python
# triple_quotes_convenience.py
dialogue = '''The teacher said, "Don't forget: it's important to practice!"'''
print(dialogue)  # Output: The teacher said, "Don't forget: it's important to practice!"

5.1.3) Common Escape Sequences

Beyond escaping quotes, the backslash introduces escape sequences — special two-character combinations that represent characters that are difficult or impossible to type directly:

python
# escape_sequences.py
# Newline: moves to the next line
print("First line\nSecond line")
# Output:
# First line
# Second line
 
# Tab: inserts horizontal spacing
print("Name:\tJohn\nAge:\t25")
# Output:
# Name:	John
# Age:	25
 
# Backslash: to include a literal backslash
path = "C:\\Users\\Documents"
print(path)  # Output: C:\Users\Documents

Here are the most commonly used escape sequences:

Escape Sequence   Meaning                    Example Output
\n                Newline (line break)       Two lines
\t                Tab (horizontal spacing)   Indented text
\\                Backslash                  \ character
\'                Single quote               ' character
\"                Double quote               " character

Understanding escape sequences is crucial when working with file paths (especially on Windows, which uses backslashes), formatted output, or any text that needs special formatting.

5.1.4) Raw Strings for Literal Backslashes

Sometimes you want backslashes to be treated literally, without triggering escape sequences. This is common when working with file paths or regular expressions (which we'll briefly touch on in Chapter 39). Python provides raw strings by prefixing the string with r:

python
# raw_strings.py
# Regular string: backslashes trigger escape sequences
regular = "C:\new\test"
print(regular)  # Output: C:
                #         ew   est
                # (\n becomes newline, \t becomes tab)
 
# Raw string: backslashes are literal
raw = r"C:\new\test"
print(raw)  # Output: C:\new\test

In a raw string, \n is literally the two characters backslash and n, not a newline. Raw strings are particularly useful for Windows file paths:

python
# windows_paths.py
# Without raw string, you need to escape every backslash
path1 = "C:\\Users\\John\\Documents\\file.txt"
 
# With raw string, backslashes work naturally
path2 = r"C:\Users\John\Documents\file.txt"
 
print(path1)  # Output: C:\Users\John\Documents\file.txt
print(path2)  # Output: C:\Users\John\Documents\file.txt

Both approaches produce the same result, but raw strings are more readable when you have many backslashes.
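
One caveat is worth knowing: a raw string literal cannot end with a single backslash, because the backslash still stops the closing quote from ending the string. The sketch below shows one possible workaround, appending the trailing backslash as a separate regular string. The folder path is only an illustrative value.

```python
# raw_string_trailing_backslash.py
# Writing a raw string that ends in one backslash is a SyntaxError:
# the final backslash keeps the closing quote from ending the string.

# One workaround: append the trailing backslash as a regular string
folder = r"C:\Users\John" + "\\"
print(folder)  # Output: C:\Users\John\

# In a raw string, \n is two characters; in a regular string, it is one
print(len(r"\n"))  # Output: 2
print(len("\n"))   # Output: 1
```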

5.2) Concatenation and String Repetition

5.2.1) Concatenating Strings with +

You can combine strings together using the + operator, which is called concatenation:

python
# string_concatenation.py
first_name = "John"
last_name = "Smith"
 
# Combining strings with +
full_name = first_name + " " + last_name
print(full_name)  # Output: John Smith
 
# Building longer strings
greeting = "Hello, " + full_name + "!"
print(greeting)  # Output: Hello, John Smith!

Concatenation creates a new string by placing the strings end-to-end. Note that Python doesn't automatically add spaces — you need to include them explicitly:

python
# concatenation_spacing.py
word1 = "Hello"
word2 = "World"
 
# Without space
no_space = word1 + word2
print(no_space)  # Output: HelloWorld
 
# With space
with_space = word1 + " " + word2
print(with_space)  # Output: Hello World

You can concatenate as many strings as you want in a single expression:

python
# multiple_concatenation.py
address = "123" + " " + "Main" + " " + "Street"
print(address)  # Output: 123 Main Street

Important limitation: You can only concatenate strings with other strings. Trying to concatenate a string with a number will cause an error:

python
# concatenation_error.py
age = 25
# This will cause an error:
# message = "I am " + age + " years old"  # TypeError!
 
# You must convert the number to a string first
message = "I am " + str(age) + " years old"
print(message)  # Output: I am 25 years old

We'll explore string-number conversion in more detail in Section 5.6.

5.2.2) Repeating Strings with *

Python provides a convenient way to repeat a string multiple times using the * operator:

python
# string_repetition.py
separator = "-" * 20
print(separator)  # Output: --------------------
 
# Creating patterns
pattern = "abc" * 3
print(pattern)  # Output: abcabcabc
 
# Useful for formatting output
print("=" * 30)
print("Important Message")
print("=" * 30)
# Output:
# ==============================
# Important Message
# ==============================

The repetition operator works with any integer. Repeating a string zero times (or a negative number of times) produces an empty string:

python
# repetition_examples.py
# Repeating zero times gives an empty string
nothing = "Hello" * 0
print(nothing)      # Output: (empty string)
print(len(nothing)) # Output: 0
 
# Repeating once gives the original string
once = "Hello" * 1
print(once)  # Output: Hello
 
# Larger repetitions
many = "Go! " * 5
print(many)  # Output: Go! Go! Go! Go! Go!

String repetition is particularly useful for creating visual separators, padding, or generating test data:

python
# practical_repetition.py
# Creating a simple text box
width = 40
border = "=" * width
title = "Welcome"
padding = " " * ((width - len(title)) // 2)
 
print(border)
print(padding + title)
print(border)
# Output:
# ========================================
#                 Welcome
# ========================================

5.2.3) Combining Concatenation and Repetition

You can combine both operators in the same expression, following Python's operator precedence rules (multiplication before addition, just like with numbers):

python
# combined_operations.py
# Repetition happens first, then concatenation
result = "=" * 10 + " Title " + "=" * 10
print(result)  # Output: ========== Title ==========
 
# Using parentheses to control order
repeated_phrase = ("Hello " + "World ") * 3
print(repeated_phrase)  # Output: Hello World Hello World Hello World

These operations form the foundation of string manipulation, allowing you to build complex strings from simple pieces.

5.3) Indexing and Slicing Strings

Strings in Python are sequences of characters, which means each character has a specific position. You can access individual characters or extract portions of a string using indexing and slicing.

5.3.1) Understanding String Indices

Each character in a string has a numerical position called an index. Python uses zero-based indexing, meaning the first character is at index 0, the second at index 1, and so on:

python
# string_indexing.py
text = "Python"
 
# Accessing individual characters by index
print(text[0])  # Output: P (first character)
print(text[1])  # Output: y (second character)
print(text[5])  # Output: n (sixth character)

Here's a visual representation of how indices map to characters:

String:  P  y  t  h  o  n
Index:   0  1  2  3  4  5

Python also supports negative indices, which count from the end of the string. Index -1 refers to the last character, -2 to the second-to-last, and so on:

python
# negative_indexing.py
text = "Python"
 
print(text[-1])  # Output: n (last character)
print(text[-2])  # Output: o (second-to-last)
print(text[-6])  # Output: P (first character)

Negative indices are particularly useful when you want to access characters near the end of a string without knowing its exact length:

String:    P  y  t  h  o  n
Positive:  0  1  2  3  4  5
Negative: -6 -5 -4 -3 -2 -1

Important: Trying to access an index that doesn't exist will cause an IndexError:

python
# index_error.py
text = "Python"
 
# This works fine
print(text[5])  # Output: n
 
# This causes an error because index 6 doesn't exist
# print(text[6])  # IndexError: string index out of range

5.3.2) Slicing Strings to Extract Substrings

While indexing gives you a single character, slicing lets you extract a portion of a string (called a substring). The basic syntax is:

string[start:stop]

This extracts characters from index start up to (but not including) index stop:

python
# basic_slicing.py
text = "Python Programming"
 
# Extract characters from index 0 up to (but not including) 6
print(text[0:6])  # Output: Python
 
# Extract characters from index 7 up to 18
print(text[7:18])  # Output: Programming
 
# Extract a middle portion
print(text[7:11])  # Output: Prog

The key thing to remember is that the stop index is not included in the result. Think of the indices as pointing between characters (shown here for the first six characters of the string):

 | P | y | t | h | o | n |
 0   1   2   3   4   5   6

So text[0:6] means "start at position 0 and stop before position 6", giving you characters at positions 0, 1, 2, 3, 4, and 5.
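
This "between characters" picture has two handy consequences, sketched below with an arbitrarily chosen split point. For in-range indices, the length of text[start:stop] is stop - start, and splitting a string at any index and rejoining the two pieces gives back the original.

```python
# slice_boundaries.py
text = "Python Programming"

# Length of a slice is stop - start (for in-range indices)
part = text[7:11]
print(part)       # Output: Prog
print(len(part))  # Output: 4

# Splitting at an index and rejoining restores the original string
split_at = 6  # arbitrary index chosen for illustration
left = text[:split_at]
right = text[split_at:]
print(left + right == text)  # Output: True
```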

5.3.3) Omitting Start or Stop Indices

You can omit the start index to slice from the beginning, or omit the stop index to slice to the end:

python
# omitting_indices.py
text = "Python Programming"
 
# From beginning to index 6
print(text[:6])  # Output: Python
 
# From index 7 to the end
print(text[7:])  # Output: Programming
 
# The entire string (from beginning to end)
print(text[:])  # Output: Python Programming

These shortcuts are very common in Python code because they make intentions clear and avoid hardcoding lengths.

5.3.4) Using Negative Indices in Slices

Negative indices work in slices too, allowing you to count from the end:

python
# negative_slice_indices.py
text = "Python Programming"
 
# Last 11 characters
print(text[-11:])  # Output: Programming
 
# Everything except the last 11 characters
print(text[:-11])  # Output: "Python " (includes the space at index 6)
 
# Last 7 characters
print(text[-7:])  # Output: ramming
 
# From index 7 up to (but not including) the last 3 characters
print(text[7:-3])  # Output: Programm (stops before 'ing')

Negative indices are especially useful when you want to exclude a certain number of characters from the end:

python
# removing_suffix.py
filename = "document.txt"
 
# Get everything except the last 4 characters (.txt)
name_without_extension = filename[:-4]
print(name_without_extension)  # Output: document
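
Note that filename[:-4] only works when the extension is exactly four characters long (a dot plus three letters). As a rough sketch of the problem and one alternative, the example below uses a hypothetical filename with a longer extension and str.removesuffix(), which is available in Python 3.9 and later.

```python
# suffix_length_caveat.py
filename = "settings.json"  # hypothetical filename for illustration

# Slicing off 4 characters leaves a stray dot, because ".json" has 5
print(filename[:-4])  # Output: settings.

# removesuffix() (Python 3.9+) removes the suffix only if it is present
stem = filename.removesuffix(".json")
print(stem)  # Output: settings

# If the suffix is absent, the string is returned unchanged
unchanged = filename.removesuffix(".txt")
print(unchanged)  # Output: settings.json
```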

5.3.5) Slicing with a Step Value

Slices can include a third value called the step, which determines how far to move forward after each selected character. A step of 2, for example, takes every second character:

string[start:stop:step]

python
# slicing_with_step.py
text = "Python Programming"
 
# Every second character from the entire string
print(text[::2])  # Output: Pto rgamn
 
# Every second character from index 0 to 6
print(text[0:6:2])  # Output: Pto
 
# Every third character
print(text[::3])  # Output: Ph oai

A particularly useful trick is using a step of -1 to reverse a string:

python
# reversing_strings.py
text = "Python"
 
# Reverse the entire string
reversed_text = text[::-1]
print(reversed_text)  # Output: nohtyP
 
# Practical example: checking for palindromes
word = "radar"
if word == word[::-1]:
    print(f"{word} is a palindrome!")  # Output: radar is a palindrome!
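
A negative step can also be combined with explicit start and stop values. In that case the slice walks backward, so start should be the larger index, and the stop index is still excluded. A small sketch:

```python
# negative_step_bounds.py
text = "Python"

# Walk backward from index 4 down to (but not including) index 1
middle_reversed = text[4:1:-1]
print(middle_reversed)  # Output: oht

# Every second character, starting from the end
every_other_back = text[::-2]
print(every_other_back)  # Output: nhy

# With a negative step, a start smaller than stop selects nothing
nothing = text[1:4:-1]
print(repr(nothing))  # Output: ''
```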

5.3.6) Slicing Never Causes Errors

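Unlike indexing, slicing is very forgiving. If you specify start or stop indices that are out of range, Python simply adjusts them to fit instead of raising an IndexError. (The one thing a slice will not accept is a step of 0, which raises a ValueError.)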

python
# safe_slicing.py
text = "Python"
 
# These all work without errors
print(text[0:100])   # Output: Python (stops at end)
print(text[10:20])   # Output: (empty string - start is beyond end)
print(text[-100:3])  # Output: Pyt (start adjusted to 0)

This behavior makes slicing safe to use even when you're not sure of the exact string length.
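
One place this forgiveness pays off is reading the first character of a string that might be empty. In the sketch below (the variable names are illustrative), indexing an empty string would raise an IndexError, while the equivalent slice quietly returns an empty string.

```python
# safe_first_character.py
name = ""  # imagine the user pressed Enter without typing anything

# name[0] would raise IndexError: string index out of range
# A slice of length one returns "" instead
first = name[:1]
print(repr(first))  # Output: ''

# With a non-empty string, the slice matches the index
word = "Python"
print(word[:1] == word[0])  # Output: True
```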

5.3.7) Practical Slicing Examples

Here are some common patterns you'll use frequently:

python
# practical_slicing.py
text = "Hello, World!"
 
# First 5 characters
print(text[:5])  # Output: Hello
 
# Last 6 characters
print(text[-6:])  # Output: World!
 
# Everything except first and last character
print(text[1:-1])  # Output: ello, World
 
# Every other character
print(text[::2])  # Output: Hlo ol!
 
# Reverse the string
print(text[::-1])  # Output: !dlroW ,olleH

Understanding indexing and slicing is fundamental to text processing in Python. These techniques will appear repeatedly throughout your programming journey.

5.4) Common String Methods for Case and Whitespace

Python strings come with many built-in methods — functions that are attached to string objects and perform operations on them. In this section, we'll explore methods for changing case and managing whitespace, which are essential for cleaning and formatting text.

5.4.1) Understanding String Methods

A method is called using dot notation: string.method_name(). Methods are functions that belong to a specific type of object. For strings, Python provides dozens of useful methods:

python
# method_basics.py
text = "hello"
 
# Calling a method on a string
result = text.upper()
print(result)  # Output: HELLO
 
# The original string is unchanged (we'll discuss why in Section 5.8)
print(text)  # Output: hello

Methods can be chained together because each method returns a new string:

python
# method_chaining.py
text = "  hello world  "
 
# Chain multiple methods
result = text.strip().upper().replace("WORLD", "PYTHON")
print(result)  # Output: HELLO PYTHON

5.4.2) Case Conversion Methods

Python provides several methods for changing the case of strings:

python
# case_methods.py
text = "Python Programming"
 
# Convert to uppercase
print(text.upper())  # Output: PYTHON PROGRAMMING
 
# Convert to lowercase
print(text.lower())  # Output: python programming
 
# Capitalize first letter, lowercase the rest
print(text.capitalize())  # Output: Python programming
 
# Title case: capitalize first letter of each word
print(text.title())  # Output: Python Programming

These methods are particularly useful for standardizing user input:

python
# case_normalization.py
# Simulating user input
user_input = "YES"
 
# Case-insensitive comparison
if user_input.lower() == "yes":
    print("User confirmed!")  # Output: User confirmed!
 
# Another approach using upper()
command = "start"
if command.upper() == "START":
    print("Starting process...")  # Output: Starting process...

The title() method capitalizes the first letter of each word, which is useful for formatting names and titles:

python
# title_case.py
name = "john smith"
print(name.title())  # Output: John Smith
 
book = "the great gatsby"
print(book.title())  # Output: The Great Gatsby

However, be aware that title() has limitations with apostrophes and special cases:

python
# title_limitations.py
text = "it's a beautiful day"
print(text.title())  # Output: It'S A Beautiful Day (note the capital S)
 
# For more sophisticated title casing, you might need custom logic

The capitalize() method capitalizes only the very first character of the string and converts every other character to lowercase:

python
# capitalize_examples.py
sentence = "python is great"
print(sentence.capitalize())  # Output: Python is great
 
# Note: only the first letter is capitalized
multi_word = "hello world"
print(multi_word.capitalize())  # Output: Hello world (not Hello World)
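
Because capitalize() lowercases everything after the first character, it can damage text that contains acronyms or names. If you only want to uppercase the first character and leave the rest alone, one possible approach is to combine slicing (Section 5.3) with upper(), as in this sketch:

```python
# capitalize_preserving_rest.py
sentence = "python was used at NASA"

# capitalize() lowercases the rest, so the acronym is lost
lowered = sentence.capitalize()
print(lowered)  # Output: Python was used at nasa

# Uppercase only the first character and keep the rest unchanged
preserved = sentence[:1].upper() + sentence[1:]
print(preserved)  # Output: Python was used at NASA
```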

5.4.3) Case-Checking Methods

Python also provides methods to check the case of strings:

python
# case_checking.py
text1 = "HELLO"
text2 = "hello"
text3 = "Hello World"
 
# Check if all characters are uppercase
print(text1.isupper())  # Output: True
print(text2.isupper())  # Output: False
 
# Check if all characters are lowercase
print(text1.islower())  # Output: False
print(text2.islower())  # Output: True
 
# Check if string is in title case
print(text3.istitle())  # Output: True
print(text2.istitle())  # Output: False

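These checking methods return True or False (Boolean values we learned about in Chapter 3), making them perfect for conditions. Characters without case, such as digits and punctuation, are ignored, so isupper() is True as long as the string contains at least one letter and every letter is uppercase: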

python
# case_checking_conditions.py
password = "SECRET123"
 
if password.isupper():
    print("Password is all uppercase")  # Output: Password is all uppercase
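
A few edge cases of the case-checking methods are worth trying for yourself. The sketch below relies only on the built-in methods and shows how strings with digits, punctuation, or no letters at all behave.

```python
# case_checking_edge_cases.py
# Digits and punctuation are ignored by the case checks
mixed_upper = "ABC123".isupper()
mixed_lower = "abc!?".islower()
print(mixed_upper)  # Output: True
print(mixed_lower)  # Output: True

# With no letters at all, both checks return False
print("123".isupper())  # Output: False
print("123".islower())  # Output: False

# The same holds for the empty string
print("".isupper())  # Output: False
```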

5.4.4) Whitespace Removal Methods

Whitespace includes spaces, tabs (\t), and newlines (\n). Python provides methods to remove whitespace from the edges of strings:

python
# whitespace_removal.py
text = "   Hello, World!   "
 
# Remove whitespace from both ends
print(text.strip())  # Output: Hello, World!
 
# Remove whitespace from the left (start)
print(text.lstrip())  # Output: Hello, World!   
 
# Remove whitespace from the right (end)
print(text.rstrip())  # Output:    Hello, World!

The strip() method is extremely useful for cleaning user input:

python
# cleaning_input.py
# Simulating user input with extra spaces
user_name = "  John Smith  "
 
# Clean up the input
clean_name = user_name.strip()
print(f"Welcome, {clean_name}!")  # Output: Welcome, John Smith!

These methods also remove tabs and newlines:

python
# strip_all_whitespace.py
text = "\n\t  Hello  \t\n"
print(repr(text))  # Output: '\n\t  Hello  \t\n'
 
cleaned = text.strip()
print(repr(cleaned))  # Output: 'Hello'

Note that strip(), lstrip(), and rstrip() only remove whitespace from the edges, not from the middle:

python
# strip_edges_only.py
text = "  Hello   World  "
print(text.strip())  # Output: Hello   World (spaces in middle remain)

5.4.5) Removing Specific Characters

The strip methods can also remove specific characters (not just whitespace) from the edges:

python
# Remove multiple different characters
text = "...Hello!!!"
cleaned = text.strip(".!")
print(cleaned)  # Output: Hello

When you pass a string to strip(), it removes any combination of those characters from the edges:

python
# strip_character_set.py
text = "xxxyyyHelloyyyxxx"
 
# Remove any x's or y's from both ends
result = text.strip("xy")
print(result)  # Output: Hello
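
Keep in mind that the argument is treated as a set of individual characters, not as a prefix or suffix to match. This can give surprising results when you try to strip a file extension. The following sketch uses a made-up filename to show the pitfall and one possible fix using endswith() and slicing:

```python
# strip_is_not_remove_suffix.py
filename = "test.txt"  # hypothetical filename for illustration

# rstrip(".txt") removes any trailing '.', 't', or 'x' characters,
# so it also eats the final 't' of "test"
stripped = filename.rstrip(".txt")
print(stripped)  # Output: tes

# One fix: check for the suffix, then slice it off by length
suffix = ".txt"
if filename.endswith(suffix):
    stem = filename[:-len(suffix)]
else:
    stem = filename
print(stem)  # Output: test
```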

5.4.6) Practical Examples Combining Case and Whitespace Methods

Here are real-world scenarios where these methods are invaluable:

python
# practical_text_cleaning.py
# Cleaning and standardizing user input
user_email = "  JohnSmith@EXAMPLE.com  "
clean_email = user_email.strip().lower()
print(clean_email)  # Output: johnsmith@example.com
 
# Formatting names properly
raw_name = "  john smith  "
formatted_name = raw_name.strip().title()
print(formatted_name)  # Output: John Smith
 
# Processing commands (case-insensitive)
command = "  START  "
if command.strip().upper() == "START":
    print("Command recognized!")  # Output: Command recognized!

These methods form the foundation of text cleaning and normalization, which you'll use constantly when processing user input, reading files, or preparing data for analysis.

5.5) Searching and Replacing in Strings

Finding and modifying text within strings is a common task in programming. Python provides powerful methods for searching for substrings and replacing text.

5.5.1) Finding Substrings with find() and index()

The find() method searches for a substring and returns the index where it first appears:

python
# find_method.py
text = "Python is great. Python is powerful."
 
# Find the first occurrence of "Python"
position = text.find("Python")
print(position)  # Output: 0 (found at the beginning)
 
# Find "great"
position = text.find("great")
print(position)  # Output: 10
 
# Find something that doesn't exist
position = text.find("Java")
print(position)  # Output: -1 (not found)

The find() method returns -1 if the substring isn't found, which makes it safe to use without causing errors:

python
# safe_searching.py
text = "Hello, World!"
 
# Check if a substring exists
if text.find("World") != -1:
    print("Found 'World'!")  # Output: Found 'World'!
 
if text.find("Python") == -1:
    print("'Python' not found")  # Output: 'Python' not found

You can also search for a substring starting from a specific position:

python
# find_with_start.py
text = "Python is great. Python is powerful."
 
# Find first occurrence
first = text.find("Python")
print(first)  # Output: 0
 
# Find next occurrence after the first one
second = text.find("Python", first + 1)
print(second)  # Output: 17
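
Repeating this idea in a loop finds every occurrence. The sketch below uses a while loop and a list, neither of which this chapter has introduced, so treat it as a preview rather than something you need to master now.

```python
# find_all_occurrences.py
text = "Python is great. Python is powerful. Python is fun."

positions = []
start = text.find("Python")
while start != -1:
    positions.append(start)
    # Continue searching just past the match we found
    start = text.find("Python", start + 1)

print(positions)  # Output: [0, 17, 37]
```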

The index() method works similarly to find(), but raises an error if the substring isn't found:

python
# index_method.py
text = "Hello, World!"
 
# This works fine
position = text.index("World")
print(position)  # Output: 7
 
# This would cause a ValueError:
# position = text.index("Python")  # ValueError: substring not found

When to use which:

  • Use find() when the substring may be missing and you want to handle that case yourself (it returns -1 if not found)
  • Use index() when you expect the substring to be there (it raises a ValueError if not found)

python
# choosing_find_vs_index.py
text = "Python Programming"
 
# Using find() for safe checking
if text.find("Java") != -1:
    print("Found Java")
else:
    print("Java not found")  # Output: Java not found
 
# Using index() when you're confident it exists
position = text.index("Python")  # We know Python is there
print(f"Found at position {position}")  # Output: Found at position 0

5.5.2) Finding from the End with rfind() and rindex()

The rfind() and rindex() methods search from the right (end) of the string:

python
# rfind_method.py
text = "Python is great. Python is powerful."
 
# Find last occurrence of "Python"
last_position = text.rfind("Python")
print(last_position)  # Output: 17
 
# Compare with find() which gives first occurrence
first_position = text.find("Python")
print(first_position)  # Output: 0

This is useful when you want the last occurrence of something:

python
# last_occurrence.py
filename = "document.backup.txt"
 
# Find the last period (to get file extension)
last_dot = filename.rfind(".")
if last_dot != -1:
    extension = filename[last_dot:]
    print(extension)  # Output: .txt
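
The same technique splits a string into the parts before and after its last separator. As an illustration, the sketch below separates a made-up path into a directory and a file name. Take it as slicing practice only, not as the recommended way to handle real file paths.

```python
# split_at_last_separator.py
path = "home/user/docs/report.txt"  # made-up path for illustration

last_slash = path.rfind("/")
directory = path[:last_slash]      # everything before the last slash
file_name = path[last_slash + 1:]  # everything after it

print(directory)  # Output: home/user/docs
print(file_name)  # Output: report.txt

# rindex() returns the same position, but raises ValueError if "/" is missing
print(path.rindex("/") == last_slash)  # Output: True
```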

5.5.3) Counting Occurrences with count()

The count() method tells you how many times a substring appears:

python
# count_method.py
text = "Python is great. Python is powerful. Python is fun."
 
# Count how many times "Python" appears
count = text.count("Python")
print(count)  # Output: 3
 
# Count a character
letter_count = text.count("o")
print(f"Letter 'o' appears {letter_count} times")  # Output: Letter 'o' appears 4 times

You can also count within a specific range:

python
# count_in_range.py
text = "abcabcabc"
 
# Count "abc" in entire string
total = text.count("abc")
print(total)  # Output: 3
 
# Count "abc" only in first 6 characters
partial = text.count("abc", 0, 6)
print(partial)  # Output: 2
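
One detail to be aware of: count() counts non-overlapping occurrences. After a match, the search resumes after the end of that match, so overlapping matches are not counted twice:

```python
# count_non_overlapping.py
# "aa" fits at indices 0, 1, and 2, but only two matches do not overlap
pairs = "aaaa".count("aa")
print(pairs)  # Output: 2

# "ana" appears at index 1 and index 3, but those two matches overlap
anas = "banana".count("ana")
print(anas)  # Output: 1
```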

5.5.4) Replacing Text with replace()

The replace() method creates a new string with all occurrences of a substring replaced:

python
# replace_method.py
text = "I love Java. Java is great."
 
# Replace all occurrences of "Java" with "Python"
new_text = text.replace("Java", "Python")
print(new_text)  # Output: I love Python. Python is great.
 
# Original string is unchanged
print(text)  # Output: I love Java. Java is great.

You can limit the number of replacements with a third argument:

python
# limited_replace.py
text = "one one one one"
 
# Replace only the first 2 occurrences
result = text.replace("one", "two", 2)
print(result)  # Output: two two one one

The replace() method is case-sensitive:

python
# case_sensitive_replace.py
text = "Python is great. python is powerful."
 
# This only replaces "Python" (capital P)
result = text.replace("Python", "Java")
print(result)  # Output: Java is great. python is powerful.

The replace() method has no case-insensitive option. A simple workaround is to lowercase the text first, although this discards the original capitalization:

python
# case_insensitive_approach.py
text = "Python is great. python is powerful."
 
# Convert to lowercase, replace, but this loses original case
result = text.lower().replace("python", "java")
print(result)  # Output: java is great. java is powerful.
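
If the capitalization of the surrounding text must be kept, one option is the re module from the standard library (regular expressions are touched on in Chapter 39). The sketch below is a minimal illustration, not a full introduction to re. It uses re.escape() so that the search text is matched literally.

```python
# case_insensitive_replace_re.py
import re

text = "Python is great. python is powerful."

# re.IGNORECASE makes the pattern match regardless of case;
# re.escape() ensures the search text is treated literally
result = re.sub(re.escape("python"), "Java", text, flags=re.IGNORECASE)
print(result)  # Output: Java is great. Java is powerful.
```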

5.5.5) Practical Searching and Replacing Examples

Here are real-world scenarios where these methods shine:

python
# practical_search_replace.py
# Cleaning data: removing unwanted characters
phone = "123-456-7890"
clean_phone = phone.replace("-", "")
print(clean_phone)  # Output: 1234567890
 
# Censoring words
message = "This is a bad word and another bad word."
censored = message.replace("bad", "***")
print(censored)  # Output: This is a *** word and another *** word.
 
# Extracting file extension
filename = "document.txt"
dot_position = filename.rfind(".")
if dot_position != -1:
    extension = filename[dot_position + 1:]
    print(f"File type: {extension}")  # Output: File type: txt
 
# Counting word occurrences (simple approach)
text = "Python is fun. I love Python. Python rocks!"
word = "Python"
occurrences = text.count(word)
print(f"'{word}' appears {occurrences} times")  # Output: 'Python' appears 3 times

Summary of the search methods, using the string 'Hello World Hello':

Operation                Returns          Meaning
find('Hello')            0                First occurrence, from the left
rfind('Hello')           12               Last occurrence, from the right
count('Hello')           2                Total occurrences
replace('Hello', 'Hi')   'Hi World Hi'    New string with replacements

These searching and replacing methods are fundamental tools for text processing, data cleaning, and string manipulation in Python programs.

5.6) Converting Between Strings and Numbers

One of the most common tasks in programming is converting between text and numeric representations. When you read user input with input(), you get a string — even if the user types a number. Similarly, when you want to display numbers in text, you need to convert them to strings.

5.6.1) Converting Strings to Numbers

We've already seen the int() and float() functions in Chapter 3, but let's explore them in more depth:

python
# string_to_number.py
# Converting string to integer
age_text = "25"
age = int(age_text)
print(age)        # Output: 25
print(type(age))  # Output: <class 'int'>
 
# Converting string to float
price_text = "19.99"
price = float(price_text)
print(price)        # Output: 19.99
print(type(price))  # Output: <class 'float'>

These conversions are essential when processing user input:

python
# user_input_conversion.py
# Simulating user input (in real code, you'd use input())
user_age = "30"
user_height = "5.9"
 
# Convert to numbers so we can do math
age = int(user_age)
height = float(user_height)
 
# Now we can perform calculations
print(f"In 10 years, you'll be {age + 10}")  # Output: In 10 years, you'll be 40
print(f"Your height in meters: {height * 0.3048:.2f}")  # Output: Your height in meters: 1.80

Important: The string must represent a valid number, or you'll get an error:

python
# conversion_errors.py
# These work fine
print(int("123"))      # Output: 123
print(float("3.14"))   # Output: 3.14
 
# These cause ValueError:
# print(int("hello"))     # ValueError: invalid literal for int() with base 10: 'hello'
# print(int("12.5"))      # ValueError: invalid literal for int() with base 10: '12.5'
# print(float("12.5.3"))  # ValueError: could not convert string to float: '12.5.3'

We'll learn how to handle these errors gracefully in Chapter 28. For now, be aware that conversion can fail if the string doesn't represent a valid number.
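
In the meantime, one simple way to avoid a crash is to test the text before converting it. The sketch below uses the string method isdecimal(), which returns True only when the string is non-empty and every character is a decimal digit. It deliberately rejects negative numbers and decimals, so treat it as a rough guard rather than a complete validator.

```python
# check_before_converting.py
user_text = "42"  # imagine this came from input()

if user_text.isdecimal():
    number = int(user_text)
    print(number + 1)  # Output: 43
else:
    print("Please enter a whole number")

# isdecimal() is strict: signs and decimal points are not digits
print("-5".isdecimal())    # Output: False
print("12.5".isdecimal())  # Output: False
print("".isdecimal())      # Output: False
```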

5.6.2) Handling Whitespace in Numeric Strings

Python's conversion functions automatically handle leading and trailing whitespace:

python
# whitespace_handling.py
# These all work fine despite the spaces
print(int("  42  "))    # Output: 42
print(float("  3.14  "))  # Output: 3.14
 
# strip() is not required here, because int() already ignores
# surrounding whitespace, but it makes the cleanup explicit
user_input = "  100  "
number = int(user_input.strip())
print(number)  # Output: 100

This is helpful when processing user input, which often contains extra spaces.
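
The conversion functions accept a few other forms as well. The following sketch lists some that are safe to rely on: an explicit sign, underscores as digit separators (Python 3.6 and later), and scientific notation for float().

```python
# accepted_numeric_formats.py
# A leading sign is allowed
negative = int("-42")
positive = int("+7")
print(negative)  # Output: -42
print(positive)  # Output: 7

# Underscores may separate digits (Python 3.6+)
million = int("1_000_000")
print(million)  # Output: 1000000

# float() understands scientific notation
thousand = float("1e3")
print(thousand)  # Output: 1000.0

# Whitespace inside the number is still an error:
# int("4 2")  # ValueError
```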

5.6.3) Converting Numbers to Strings

The str() function converts any value to its string representation:

python
# number_to_string.py
age = 25
height = 5.9
 
# Convert numbers to strings
age_text = str(age)
height_text = str(height)
 
print(type(age_text))     # Output: <class 'str'>
print(type(height_text))  # Output: <class 'str'>
 
# Now we can concatenate them with other strings
message = "I am " + str(age) + " years old"
print(message)  # Output: I am 25 years old

This is necessary whenever you want to combine numbers with strings:

python
# concatenation_with_numbers.py
score = 95
total = 100
 
# Must convert numbers to strings for concatenation
result = "Score: " + str(score) + "/" + str(total)
print(result)  # Output: Score: 95/100
 
# Alternative: use f-strings (covered in detail in Chapter 6)
result = f"Score: {score}/{total}"
print(result)  # Output: Score: 95/100

5.6.4) Converting Between Integer and Float

You can also convert between integer and float types:

python
# int_float_conversion.py
# Float to int (truncates decimal part)
price = 19.99
price_int = int(price)
print(price_int)  # Output: 19 (decimal part removed, not rounded)
 
# Int to float
age = 25
age_float = float(age)
print(age_float)  # Output: 25.0

Important: Converting float to int truncates (cuts off) the decimal part — it doesn't round:

python
# truncation_not_rounding.py
print(int(3.9))   # Output: 3 (not 4!)
print(int(3.1))   # Output: 3
print(int(-3.9))  # Output: -3 (truncates toward zero)
 
# To round, use the round() function (covered in Chapter 4)
print(round(3.9))  # Output: 4 (with one argument, round() already returns an int)
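
One subtlety: when a value is exactly halfway between two integers, round() picks the even neighbour rather than always rounding up. A short sketch, with one possible alternative for non-negative values:

```python
# round_half_to_even.py
print(round(2.5))  # Output: 2 (2 is the even neighbour)
print(round(3.5))  # Output: 4
print(round(0.5))  # Output: 0

# For non-negative values, adding 0.5 before truncating rounds halves up
value = 2.5
half_up = int(value + 0.5)
print(half_up)  # Output: 3
```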

5.6.5) Practical Conversion Examples

Here are real-world scenarios where type conversion is essential:

python
# practical_conversions.py
# Reading and processing user input
# (Simulating input() - in real code, you'd use input())
user_input = "42"
 
# Convert to number to do calculations
number = int(user_input)
doubled = number * 2
print(f"Double of {number} is {doubled}")  # Output: Double of 42 is 84
 
# Building formatted output
name = "John"
age = 30
height = 5.9
 
# Method 1: Convert numbers to strings
info = name + " is " + str(age) + " years old and " + str(height) + " feet tall"
print(info)  # Output: John is 30 years old and 5.9 feet tall
 
# Method 2: Use f-strings (more readable - covered in Chapter 6)
info = f"{name} is {age} years old and {height} feet tall"
print(info)  # Output: John is 30 years old and 5.9 feet tall
 
# Processing data from files (preview)
data_line = "100,200,300"  # Simulating a line from a CSV file
numbers = data_line.split(",")  # Split into list of strings
total = int(numbers[0]) + int(numbers[1]) + int(numbers[2])
print(f"Total: {total}")  # Output: Total: 600
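
The CSV preview above adds exactly three fields by index. A minimal sketch of the same idea for any number of fields might look like this; the function name sum_csv_line is hypothetical, and it assumes every field is a whole number:

```python
def sum_csv_line(line):
    """Add up the comma-separated whole numbers in one line of text."""
    total = 0
    for field in line.split(","):
        # strip() removes spaces around each field, as in "5, 10"
        total += int(field.strip())
    return total

print(sum_csv_line("100,200,300"))    # Output: 600
print(sum_csv_line("5, 10, 15, 20"))  # Output: 50
```

If a field is not a valid integer, int() raises ValueError, just as in the pitfalls discussed in this section.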

5.6.6) Common Conversion Pitfalls

Be aware of these common mistakes:

python
# conversion_pitfalls.py
# Pitfall 1: Trying to convert non-numeric strings
# text = "hello"
# number = int(text)  # ValueError!
 
# Pitfall 2: Forgetting to convert before arithmetic
age_text = "25"
# next_year = age_text + 1  # TypeError: can only concatenate str (not "int") to str
 
# Correct approach:
age = int(age_text)
next_year = age + 1
print(next_year)  # Output: 26
 
# Pitfall 3: Losing precision with int()
price = 19.99
price_int = int(price)  # Becomes 19, not 20!
print(price_int)  # Output: 19
 
# Pitfall 4: Trying to convert strings with commas or currency symbols
# price_text = "$1,234.56"
# price = float(price_text)  # ValueError!
 
# You'd need to clean the string first:
price_text = "$1,234.56"
clean_price = price_text.replace("$", "").replace(",", "")
price = float(clean_price)
print(price)  # Output: 1234.56

Understanding type conversion is crucial for building programs that interact with users and process real-world data. You'll use these conversions constantly throughout your Python programming journey.
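
Pitfalls 1 and 4 can be handled together by cleaning the text and catching the ValueError that float() raises on bad input. The following is only a sketch: parse_price is a hypothetical helper, it assumes US-style formatting ("$" and "," as the only decorations), and it uses try/except, which this chapter has not introduced:

```python
def parse_price(text):
    """Return the price in text as a float, or None if it can't be parsed."""
    # Remove surrounding spaces, the dollar sign, and thousands separators
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        # float() rejects anything that is still not a number
        return None

print(parse_price("$1,234.56"))  # Output: 1234.56
print(parse_price("  42 "))      # Output: 42.0
print(parse_price("hello"))      # Output: None
```

Returning None lets the caller decide what to do with bad input, for example asking the user to try again.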

5.7) Checking Substrings with in and not in

Python provides simple and readable ways to check if one string contains another using the in and not in operators. These are incredibly useful for validation, filtering, and decision-making in your programs.

5.7.1) Using in to Check for Substrings

The in operator returns True if one string is found within another, and False otherwise:

python
# in_operator.py
text = "Python is a powerful programming language"
 
# Check if substring exists
print("Python" in text)      # Output: True
print("powerful" in text)    # Output: True
print("Java" in text)        # Output: False

This is much more readable than using find() or index():

python
# in_vs_find.py
text = "Hello, World!"
 
# Using in (clear and readable)
if "World" in text:
    print("Found World!")  # Output: Found World!
 
# Using find (less readable)
if text.find("World") != -1:
    print("Found World!")  # Output: Found World!

The in operator is case-sensitive:

python
# case_sensitivity.py
text = "Python Programming"
 
print("python" in text)   # Output: False (lowercase 'p')
print("Python" in text)   # Output: True (uppercase 'P')
 
# For case-insensitive checking, convert to lowercase first
print("python" in text.lower())  # Output: True
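
The example above works because the search term is already lowercase. When the search term can also vary in case, one option is to lowercase both sides. Here is a small sketch; contains_ignore_case is a name invented for this example:

```python
def contains_ignore_case(text, fragment):
    """Return True if fragment occurs in text, ignoring letter case."""
    return fragment.lower() in text.lower()

print(contains_ignore_case("Python Programming", "PYTHON"))  # Output: True
print(contains_ignore_case("Python Programming", "gram"))    # Output: True
print(contains_ignore_case("Python Programming", "java"))    # Output: False
```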

5.7.2) Using not in to Check for Absence

The not in operator checks if a substring is NOT present:

python
# not_in_operator.py
text = "Python is great"
 
print("Java" not in text)     # Output: True (Java is not there)
print("Python" not in text)   # Output: False (Python is there)

This is particularly useful for validation:

python
# validation_examples.py
# Checking for invalid characters in a username
username = "john_smith"
 
if " " not in username:
    print("Username is valid (no spaces)")  # Output: Username is valid (no spaces)
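
Real validation usually checks for more than one unwanted character. As a rough sketch, the in operator can be applied to each forbidden character in turn. The function name and the particular set of forbidden characters (space, "@", and "/") are assumptions made for illustration:

```python
def first_forbidden_character(username, forbidden=" @/"):
    """Return the first forbidden character found in username, or None."""
    for char in forbidden:
        if char in username:
            return char
    return None

print(first_forbidden_character("john_smith"))        # Output: None
print(repr(first_forbidden_character("john smith")))  # Output: ' '
print(first_forbidden_character("john/smith"))        # Output: /
```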

5.7.3) Additional String Checking Methods

Python provides several other useful methods for checking string properties:

python
# string_checking_methods.py
text = "Python"
 
# Check if string starts with a substring
print(text.startswith("Py"))   # Output: True
print(text.startswith("Ja"))   # Output: False
 
# Check if string ends with a substring
print(text.endswith("on"))     # Output: True
print(text.endswith("ing"))    # Output: False
 
# These are more precise than using in
filename = "report.txt"
print(filename.endswith(".txt"))  # Output: True
print(".txt" in filename)         # Output: True (but less precise: matches ".txt" anywhere in the name)
 
# startswith/endswith can check multiple options
filename = "document.pdf"
print(filename.endswith((".pdf", ".doc", ".txt")))  # Output: True

These checking methods are essential tools for input validation, data filtering, and conditional logic in your programs. They make your code more readable and maintainable compared to manual string searching.
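
The difference in precision between endswith() and in shows up with a name such as "report.txt.bak". The sketch below wraps the tuple form of endswith() in a hypothetical helper, has_allowed_extension; the default extensions are assumptions chosen for the example:

```python
def has_allowed_extension(filename, extensions=(".txt", ".csv")):
    """Return True if filename ends with one of the allowed extensions."""
    # lower() makes the check case-insensitive; endswith() accepts a tuple
    return filename.lower().endswith(extensions)

print(has_allowed_extension("report.txt"))      # Output: True
print(has_allowed_extension("DATA.CSV"))        # Output: True
print(has_allowed_extension("report.txt.bak"))  # Output: False
print(".txt" in "report.txt.bak")               # Output: True
```

The last line shows why in is the wrong tool here: it finds ".txt" in the middle of the name even though the file is really a .bak file.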

5.8) Strings Are Immutable: What That Means in Practice

One of the most important characteristics of Python strings is that they are immutable — once created, they cannot be changed. This might seem like a limitation at first, but understanding immutability is crucial for writing correct Python code and avoiding subtle bugs.

5.8.1) What Immutability Means

When we say strings are immutable, we mean you cannot modify the characters in an existing string. Any operation that seems to "change" a string actually creates a new string:

python
# immutability_basics.py
text = "Hello"
 
# This looks like it changes the string, but it doesn't
text = text + " World"
print(text)  # Output: Hello World
 
# What actually happened:
# 1. Python created a new string "Hello World"
# 2. The variable 'text' now refers to this new string
# 3. The original "Hello" string still exists (until garbage collected)

You cannot change individual characters in a string:

python
# cannot_modify_characters.py
text = "Hello"
 
# This causes an error:
# text[0] = "J"  # TypeError: 'str' object does not support item assignment
 
# You must create a new string instead
text = "J" + text[1:]
print(text)  # Output: Jello
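
The slicing trick generalizes to any position. A minimal sketch, using a made-up helper called replace_at and assuming a non-negative index:

```python
def replace_at(text, index, replacement):
    """Return a copy of text with the character at index replaced."""
    if not 0 <= index < len(text):
        raise IndexError("index out of range")
    # Everything before index + the new text + everything after index
    return text[:index] + replacement + text[index + 1:]

word = "Hello"
print(replace_at(word, 0, "J"))  # Output: Jello
print(replace_at(word, 4, "!"))  # Output: Hell!
print(word)                      # Output: Hello
```

The original word is untouched; each call builds and returns a new string.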

This is fundamentally different from how lists work (which we'll learn about in Chapter 13). Lists are mutable — you can change their elements:

python
# lists_are_mutable.py
# Preview of lists (covered in Chapter 13)
numbers = [1, 2, 3]
numbers[0] = 10  # This works fine with lists
print(numbers)   # Output: [10, 2, 3]
 
# But strings don't allow this:
text = "Hello"
# text[0] = "J"  # TypeError with strings!

5.8.2) Why String Methods Return New Strings

All string methods that appear to modify a string actually return a new string, leaving the original unchanged:

python
# methods_return_new_strings.py
original = "hello world"
 
# These methods return new strings
uppercase = original.upper()
capitalized = original.capitalize()
replaced = original.replace("world", "Python")
 
# Original string is unchanged
print(original)      # Output: hello world
print(uppercase)     # Output: HELLO WORLD
print(capitalized)   # Output: Hello world
print(replaced)      # Output: hello Python

This is why you need to assign the result to a variable (or reassign the same variable) to keep the changes:

python
# keeping_changes.py
text = "  hello  "
 
# Wrong: result is lost
text.strip()
print(text)  # Output:   hello   (still has spaces!)
 
# Correct: assign the result
text = text.strip()
print(text)  # Output: hello (spaces removed)

This is a common mistake for beginners:

python
# common_mistake.py
message = "python programming"
 
# Mistake: calling method but not using the result
message.upper()
message.replace("python", "Python")
print(message)  # Output: python programming (unchanged!)
 
# Correct: assign results
message = message.upper()
message = message.replace("PYTHON", "Python")
print(message)  # Output: Python PROGRAMMING
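
Because every method returns a new string, calls can also be chained: each method runs on the result of the one before it. A short sketch with the same kind of data:

```python
message = "  python programming  "

# strip() runs first, then upper(), then replace()
cleaned = message.strip().upper().replace("PYTHON", "Python")

print(cleaned)                              # Output: Python PROGRAMMING
print(message == "  python programming  ")  # Output: True
```

The second line of output confirms that message itself still holds the original text.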

5.8.3) Implications of Immutability

Understanding immutability helps you write better code:

1. Strings are safe to share:

python
# safe_sharing.py
original = "Hello"
copy = original  # Both variables point to the same string
 
# Since strings are immutable, this is safe
copy = copy + " World"
 
print(original)  # Output: Hello (unchanged)
print(copy)      # Output: Hello World (new string)

2. String operations create new objects:

python
# new_objects.py
text = "Python"
 
# Each operation here produces different text, so each result is a new string object
result1 = text.upper()
result2 = text.lower()
result3 = text.replace("P", "J")
 
# All different objects
print(id(text))     # Some memory address
print(id(result1))  # Different memory address
print(id(result2))  # Different memory address
print(id(result3))  # Different memory address

3. Building strings in loops can be inefficient:

python
# inefficient_string_building.py
# This creates many temporary string objects
result = ""
for i in range(5):
    result = result + str(i)  # Creates new string each time
print(result)  # Output: 01234
 
# More efficient approach (for many concatenations):
# Use a list and join (we'll learn this in Chapter 6)
parts = []
for i in range(5):
    parts.append(str(i))
result = "".join(parts)
print(result)  # Output: 01234
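
The two approaches can be wrapped in functions to confirm that they produce the same text. This is only a sketch: the function names are invented, and the second version passes a generator expression to join(), a feature not covered in this chapter:

```python
def build_with_plus(count):
    """Concatenate the numbers 0..count-1 one step at a time."""
    result = ""
    for i in range(count):
        result = result + str(i)  # new string object on every pass
    return result

def build_with_join(count):
    """Build the same text with a single join() call."""
    return "".join(str(i) for i in range(count))

print(build_with_plus(5))                          # Output: 01234
print(build_with_join(5))                          # Output: 01234
print(build_with_plus(12) == build_with_join(12))  # Output: True
```

For a handful of pieces either version is fine; join() is worth reaching for when the number of pieces is large.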

5.8.4) Immutability and Function Arguments

When you pass a string to a function, you don't need to worry about it being accidentally modified:

python
# safe_function_arguments.py
def process_text(text):
    # Any operations create new strings
    text = text.upper()
    text = text.replace("A", "X")
    return text
 
original = "banana"
result = process_text(original)
 
print(original)  # Output: banana (unchanged)
print(result)    # Output: BXNXNX (modified version)

This is different from mutable types (like lists, which we'll learn in Chapter 13), where in-place modifications made inside a function also change the caller's object.
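
The contrast can be sketched with two small functions, one taking a string and one taking a list. Both function names are hypothetical, and append() is a list method previewed here ahead of Chapter 13:

```python
def add_exclamation(text):
    text = text + "!"    # builds a new string; the caller's string is untouched
    return text

def append_exclamation(items):
    items.append("!")    # changes the caller's list in place
    return items

word = "hi"
letters = ["h", "i"]

new_word = add_exclamation(word)
same_list = append_exclamation(letters)

print(word)                  # Output: hi
print(new_word)              # Output: hi!
print(letters)               # Output: ['h', 'i', '!']
print(same_list is letters)  # Output: True
```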

5.8.5) Visualizing Immutability

Here's a visual representation of what happens when you "modify" a string:

Step 1: text = 'Hello'
    Memory: 'Hello' is stored at address 1000 (addresses here are illustrative), and text refers to it.

Step 2: text = text + ' World'
    1. Python creates a new string 'Hello World' at address 2000.
    2. The variable text now points to address 2000.
    3. The original 'Hello' at address 1000 remains until it is garbage collected.

Result: text now refers to 'Hello World'.

Understanding that strings are immutable helps you:

  1. Avoid mistakes where you forget to capture method results
  2. Understand why string operations create new objects
  3. Write more efficient code when building large strings
  4. Safely share strings between different parts of your program

This immutability is a fundamental characteristic that distinguishes strings from mutable types like lists, which we'll explore in detail in Part IV of this book.
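
The rebinding described in this section can be observed with the built-in id() function. In this sketch a second variable, first_object, keeps the original string alive so the two identities can be compared; the actual numbers returned by id() vary from run to run:

```python
text = "Hello"
first_object = text           # second reference to the original string
id_before = id(text)

text = text + " World"        # builds a new string and rebinds the name
id_after = id(text)

print(first_object)           # Output: Hello
print(text)                   # Output: Hello World
print(id_before == id_after)  # Output: False
```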


Chapter Summary:

In this chapter, you've learned the fundamentals of working with text in Python using strings. You now understand how to:

  • Create string literals using quotes and escape sequences
  • Combine strings with concatenation and repetition
  • Access individual characters and extract substrings with indexing and slicing
  • Transform strings using methods for case conversion and whitespace removal
  • Search for and replace text within strings
  • Convert between strings and numbers for input processing and output formatting
  • Check for substrings using in and not in operators
  • Recognize that strings are immutable and what that means for your code

These string manipulation skills form the foundation for text processing in Python. You'll use these techniques constantly when building user interfaces, processing data files, validating input, and formatting output.

In the next chapter, we'll build on these fundamentals by exploring more advanced string handling techniques, including splitting and joining strings, powerful formatting with f-strings and the format() method, and understanding text encoding to work with international characters.

© 2025. Primesoft Co., Ltd.
support@primesoft.ai