String Operations
In Python, a string is an immutable sequence of characters used to store and manipulate text. Python provides many built-in string methods, which are especially useful in data analysis for cleaning, transforming, and analyzing text data. Because strings are immutable, these methods return a new string rather than changing the original. Below are some of the most commonly used string operations.
1. Converting to Lowercase and Uppercase
You can convert a string to all lowercase or uppercase characters using the lower() and upper() methods.
    text = "Wiki Methods"
    print(text.lower())
    print(text.upper())
The output is, in order:
    wiki methods
    WIKI METHODS
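A common use of case conversion in data cleaning is normalizing inconsistent capitalization before comparing or counting values. The following is a minimal sketch; the list `answers` is hypothetical example data, not part of the original text.

```python
# Hypothetical raw answers with inconsistent capitalization
answers = ["Yes", "yes", "YES", "No", "no"]

# Convert every answer to lowercase so equal values compare equal
normalized = [answer.lower() for answer in answers]

# Collect the distinct values in sorted order
unique = sorted(set(normalized))

print(normalized)  # ['yes', 'yes', 'yes', 'no', 'no']
print(unique)      # ['no', 'yes']
```

Note that lower() does not modify the strings in `answers`; it returns new strings, which is why the results are stored in a new list.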
2. Counting Occurrences
To count how many times a specific substring appears in a string, use the count() method.
    text = "bananas"
    print(text.count("a"))
The output will be:
3
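A few properties of count() are easy to overlook: it works with substrings of any length, it is case-sensitive, and it counts non-overlapping occurrences. A short sketch illustrating these points (the variable names are illustrative only):

```python
text = "bananas"

# Substrings longer than one character work the same way
two_letters = text.count("an")
print(two_letters)  # 2

# Matching is case-sensitive, so an uppercase "A" is not found
upper_a = text.count("A")
print(upper_a)      # 0

# Occurrences are counted without overlap
overlap = "aaaa".count("aa")
print(overlap)      # 2
```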
3. Finding the Length of a String
The len() function returns the number of characters in a string.
    text = "Data"
    print(len(text))  # Output: 4
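Every character counts toward the length, including spaces and punctuation, and an empty string has length 0. A brief sketch, with variable names chosen only for illustration:

```python
title = "Data Science"
empty = ""

title_length = len(title)
print(title_length)   # 12, the space counts as one character

empty_length = len(empty)
print(empty_length)   # 0

# A simple check for a missing text value
is_missing = len(empty) == 0
print(is_missing)     # True
```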
4. Accessing Characters by Index
Because a string is a sequence, each of its characters has an index. You can access an individual character by writing its index in square brackets [] after the string.
Positive Indexing
Indexing starts at 0 for the first character, 1 for the second, and so on.
    text = "Python"
    print(text[0])  # Output: "P"
    print(text[3])  # Output: "h"
In this example, text[0] returns the first character "P", and text[3] returns the fourth character "h".
Negative Indexing
Python also supports negative indexing, where -1 refers to the last character, -2 refers to the second-to-last character, and so on.
    text = "Python"
    print(text[-1])  # Output: "n"
    print(text[-2])  # Output: "o"
In this example, text[-1] returns the last character "n", and text[-2] returns the second-to-last character "o".
IndexError
If you try to access an index that is out of the range of the string, Python will raise an IndexError.
    text = "Python"
    print(text[10])  # Raises IndexError: string index out of range
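When an index may be out of range, for example with text values of varying length, the error can be caught with try/except. The helper `char_at` below is a hypothetical function written for this illustration, not a built-in.

```python
def char_at(text, index, default=None):
    """Return text[index], or default if the index is out of range."""
    try:
        return text[index]
    except IndexError:
        return default


print(char_at("Python", 3))        # h
print(char_at("Python", 10))       # None
print(char_at("Python", 10, "?"))  # ?
```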
Slicing Strings
You can access a range of characters by using slicing. The syntax for slicing is start:end, where start is the index to begin slicing (inclusive) and end is the index where slicing stops (exclusive).
    text = "Python"
    print(text[0:2])  # Output: "Py"
    print(text[2:5])  # Output: "tho"
Slicing also supports negative indexing and allows you to skip characters by providing a step value (start:end:step).
    text = "Python"
    print(text[::2])  # Output: "Pto"
In this example, text[::2] starts at the beginning and takes every second character (P, t, o).
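The slicing rules above also allow omitting start or end, using negative indices, and using a negative step. The sketch below shows a few such variations. Unlike single-index access, a slice whose bounds fall outside the string does not raise an IndexError.

```python
text = "Python"

first_two = text[:2]       # start omitted: begin at index 0
from_third = text[2:]      # end omitted: continue to the last character
last_three = text[-3:]     # negative start: the last three characters
reversed_text = text[::-1] # step of -1 walks the string backwards
clipped = text[2:100]      # out-of-range end is clipped, no error

print(first_two)      # Py
print(from_third)     # thon
print(last_three)     # hon
print(reversed_text)  # nohtyP
print(clipped)        # thon
```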
5. Finding Substrings
To find the position of a substring within a string, use the find() method. It returns the index of the first occurrence or -1 if the substring is not found.
    text = "data science"
    print(text.find("science"))  # Output: 5
    print(text.find("math"))     # Output: -1
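Two related tools are worth knowing alongside find(): the `in` operator, which is sufficient when only a yes/no answer is needed, and the optional start argument of find(), which makes it possible to locate later occurrences. A minimal sketch, using an example sentence made up for this illustration:

```python
text = "data science and data analysis"

# The in operator returns True or False
has_science = "science" in text
print(has_science)    # True

# find() accepts an optional start index for the search
first = text.find("data")
second = text.find("data", first + 1)
print(first, second)  # 0 17
```

The related method index() behaves like find() but raises a ValueError instead of returning -1 when the substring is not found.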
6. Replacing Substrings
The replace() method returns a new string in which every occurrence of a substring is replaced with another. The original string is left unchanged.
    text = "I love Python"
    print(text.replace("Python", "Data Science"))
    # Output: "I love Data Science"
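replace() also takes an optional third argument that limits how many occurrences are replaced. The sketch below shows this together with the fact that the original string is not modified; the example string is made up for illustration.

```python
text = "red, red, red"

# By default, all occurrences are replaced
all_replaced = text.replace("red", "blue")
print(all_replaced)  # blue, blue, blue

# The original string is unchanged
print(text)          # red, red, red

# A count of 1 replaces only the first occurrence
first_only = text.replace("red", "blue", 1)
print(first_only)    # blue, red, red
```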
7. Splitting Strings
The split() method splits a string into a list of substrings at a specified delimiter. If no delimiter is given, it splits on runs of whitespace and ignores leading and trailing whitespace.
    text = "apple, banana, cherry"
    print(text.split(", "))  # Output: ['apple', 'banana', 'cherry']
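The following sketch illustrates three related behaviors: calling split() with no argument, limiting the number of splits with the optional maxsplit argument, and reversing a split with join(). The example strings are made up for illustration.

```python
line = "  alpha   beta gamma  "

# No argument: split on runs of whitespace, ignoring the ends
words = line.split()
print(words)   # ['alpha', 'beta', 'gamma']

# The second argument (maxsplit) limits the number of splits
pair = "key=value=more".split("=", 1)
print(pair)    # ['key', 'value=more']

# join() is the inverse operation: it concatenates a list of strings
joined = "-".join(words)
print(joined)  # alpha-beta-gamma
```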
8. Stripping Whitespace
The strip() method removes leading and trailing whitespace from a string.
    text = " Hello World "
    print(text.strip())  # Output: "Hello World"
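Two variants are often useful when cleaning text: lstrip() and rstrip() remove whitespace from only one side, and strip() accepts an optional argument naming the characters to remove instead of whitespace. A short sketch, with example strings made up for illustration:

```python
text = "  Hello World  "

left = text.lstrip()   # remove leading whitespace only
right = text.rstrip()  # remove trailing whitespace only
print(repr(left))      # 'Hello World  '
print(repr(right))     # '  Hello World'

# Passing characters removes those from both ends instead of whitespace
trimmed = "***note***".strip("*")
print(trimmed)         # note
```

repr() is used in the print calls only to make the remaining spaces visible.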
9. f-Strings for Formatting
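An f-string (formatted string literal, available since Python 3.6) lets you embed variables and other expressions directly in a string by writing them inside curly braces {}. It is usually more concise and readable than older approaches such as the % operator or str.format().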
    name = "Alice"
    age = 30
    print(f"My name is {name} and I am {age} years old.")
    # Output: "My name is Alice and I am 30 years old."
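f-strings also accept a format specification after a colon inside the braces, which is handy for reporting numeric results. The sketch below shows a few common specifications; the values of `price` and `share` are made-up example numbers.

```python
price = 1234.5678
share = 0.4567

two_decimals = f"Price: {price:.2f}"     # round to two decimal places
with_commas = f"Price: {price:,.2f}"     # add a thousands separator
as_percent = f"Share: {share:.1%}"       # format as a percentage
expression = f"Total: {2 + 3}"           # any expression is allowed

print(two_decimals)  # Price: 1234.57
print(with_commas)   # Price: 1,234.57
print(as_percent)    # Share: 45.7%
print(expression)    # Total: 5
```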