String Operations

From Sustainability Methods
Revision as of 09:39, 18 September 2024 by Gustavo

In Python, a string is a sequence of characters used to store and manipulate text. Python provides many built-in methods for common string operations, which are especially useful in data analysis for cleaning, transforming, and analyzing text data. Below are some of the most commonly used string operations.

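To apply these operations, you typically assign the string to a variable and then call a method on that variable, for example text.lower() or text.strip(). Note that len() is a built-in function rather than a string method, so it is called as len(text), not text.len(). Strings are immutable: every method shown here returns a new value and leaves the original string unchanged.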

1. Converting to Lowercase and Uppercase

You can convert a string to all lowercase or uppercase characters using the lower() and upper() methods.

text = "Wiki Methods"
print(text.lower())  
print(text.upper())

The output will be, respectively:

wiki methods
WIKI METHODS
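
A common use of lower() in data cleaning is comparing text regardless of capitalization. The following is a minimal sketch; the list of answers is made-up sample data used only for illustration.

```python
# Hypothetical survey answers with inconsistent capitalization
answers = ["Yes", "YES", "no", "yes"]

# Normalize every answer to lowercase before comparing
normalized = [answer.lower() for answer in answers]
yes_count = normalized.count("yes")  # list.count, applied to the cleaned list

print(normalized)  # ['yes', 'yes', 'no', 'yes']
print(yes_count)   # 3

# lower() returns a new string; the original values are unchanged
print(answers[0])  # Yes
```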

2. Counting Occurrences

To count how many times a specific substring appears in a string, use the count() method. The search is case-sensitive and counts non-overlapping occurrences.

text = "bananas"
print(text.count("a"))

The output will be:

3
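
Two details of count() are easy to overlook: it distinguishes upper and lower case, and it does not count overlapping matches. A short sketch with made-up strings:

```python
text = "Banana banana"

# count() is case-sensitive: "B" and "b" are different characters
case_sensitive = text.count("b")
case_insensitive = text.lower().count("b")
print(case_sensitive)    # 1
print(case_insensitive)  # 2

# Matches are counted without overlap: "aaaa" contains "aa" twice, not three times
non_overlapping = "aaaa".count("aa")
print(non_overlapping)   # 2
```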

3. Finding the Length of a String

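The built-in len() function returns the number of characters in a string, including whitespace. Unlike the other operations on this page, it is a function that takes the string as its argument, not a method called on the string.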

text = "Data Analysis"
print(len(text))

The output will be:

13
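
Because whitespace counts towards the length, len() can help spot padded or empty values. A small sketch with assumed example strings:

```python
padded = "  Data  "

padded_length = len(padded)            # two spaces + 4 letters + two spaces
trimmed_length = len(padded.strip())   # length after removing outer whitespace
empty_length = len("")                 # an empty string has length 0
control_length = len("a\tb\n")         # tab and newline each count as one character

print(padded_length)   # 8
print(trimmed_length)  # 4
print(empty_length)    # 0
print(control_length)  # 4
```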

4. Accessing Characters by Index

In Python, strings are sequences of characters, and each character in the string has an associated index. The index starts from 0 for the first character and increases by 1 for each subsequent character. You can access individual characters in a string using square brackets [], followed by the index of the character you want to access.

Positive Indexing

With positive indexing, the first character has index 0, so the last character of a string of length n has index n - 1.

text = "Python"
print(text[0])  # Output: "P"
print(text[3])  # Output: "h"

In this example, text[0] returns the first character "P", and text[3] returns the fourth character "h".

Negative Indexing

Python also supports negative indexing, where -1 refers to the last character, -2 refers to the second-to-last character, and so on.

text = "Python"
print(text[-1])  # Output: "n"
print(text[-2])  # Output: "o"

In this example, text[-1] returns the last character "n", and text[-2] returns the second-to-last character "o".
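
Positive and negative indices refer to the same positions, counted from opposite ends. The relationship can be checked with a short sketch:

```python
text = "Python"
n = len(text)  # 6

# The last character can be reached from either end
same_last = text[n - 1] == text[-1]
# The first character has index 0 and also index -n
same_first = text[0] == text[-n]

print(same_last)   # True
print(same_first)  # True
```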

IndexError

If you try to access an index that is out of the range of the string, Python will raise an IndexError.

text = "Python"
print(text[10])  # Raises IndexError: string index out of range
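
One way to avoid a crash is to catch the exception and fall back to a default value. The helper below is a hypothetical function written for this illustration, not part of Python's standard library.

```python
def char_at(text, index, default=None):
    """Return text[index], or default if the index is out of range."""
    try:
        return text[index]
    except IndexError:
        return default


print(char_at("Python", 3))        # h
print(char_at("Python", 10))       # None
print(char_at("Python", 10, "?"))  # ?
```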

Slicing Strings

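You can access a range of characters by using slicing. The syntax is text[start:end], where start is the index at which the slice begins (inclusive) and end is the index at which it stops (exclusive). If start is omitted, the slice begins at the start of the string; if end is omitted, it runs to the end of the string.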

text = "Python"
print(text[0:2])  # Output: "Py"
print(text[2:5])  # Output: "tho"

Slicing also supports negative indexing and allows you to skip characters by providing a step value (start:end:step).

text = "Python"
print(text[::2])  # Output: "Pto"

In this example, text[::2] starts at the beginning and takes every second character (P, t, o).
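
Negative indices, omitted bounds, and a negative step can be combined in slices. The sketch below also shows that, unlike single-character indexing, a slice that runs past the end of the string does not raise an IndexError.

```python
text = "Python"

last_three = text[-3:]        # from the third-to-last character to the end
without_last_two = text[:-2]  # everything except the last two characters
reversed_text = text[::-1]    # a step of -1 walks the string backwards
beyond_end = text[2:100]      # out-of-range bounds are clipped silently

print(last_three)        # hon
print(without_last_two)  # Pyth
print(reversed_text)     # nohtyP
print(beyond_end)        # thon
```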

5. Finding Substrings

To find the position of a substring within a string, use the find() method. It returns the index of the first occurrence or -1 if the substring is not found.

text = "data science"
print(text.find("science"))  # Output: 5
print(text.find("math"))     # Output: -1
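
Some related tools are worth knowing alongside find(): an optional start argument for locating later occurrences, the in operator for a simple yes/no check, and index(), which raises a ValueError instead of returning -1. A brief sketch with an assumed example string:

```python
text = "data science data"

first = text.find("data")              # first occurrence
second = text.find("data", first + 1)  # search again, starting after the first match
has_science = "science" in text        # membership test returns True or False

print(first)        # 0
print(second)       # 13
print(has_science)  # True

# index() works like find() but raises ValueError when nothing is found
try:
    text.index("math")
    math_found = True
except ValueError:
    math_found = False
print(math_found)   # False
```

Since -1 is also a valid index (the last character), it is safer to check the result of find() before using it to index or slice the string.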

6. Replacing Substrings

The replace() method returns a new string in which all occurrences of a substring are replaced with another. The original string is not changed.

text = "I love Python"
print(text.replace("Python", "Data Science"))  # Output: "I love Data Science"
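
replace() also accepts an optional third argument that limits how many occurrences are replaced, and replacing with an empty string removes the substring. A minimal sketch with a made-up string:

```python
text = "a-b-c-d"

all_replaced = text.replace("-", "+")   # every occurrence
first_only = text.replace("-", "+", 1)  # at most one replacement
removed = text.replace("-", "")         # replacing with "" deletes the substring

print(all_replaced)  # a+b+c+d
print(first_only)    # a+b-c-d
print(removed)       # abcd
print(text)          # a-b-c-d (the original is unchanged)
```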

7. Splitting Strings

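The split() method splits a string into a list of substrings based on a specified delimiter. If no delimiter is given, the string is split on any run of whitespace (spaces, tabs, newlines), and leading and trailing whitespace is ignored.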

text = "apple, banana, cherry"
print(text.split(", "))  # Output: ['apple', 'banana', 'cherry']
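
The difference between calling split() without an argument and with an explicit " " shows up when the text contains repeated spaces. The sketch below also uses join(), the counterpart of split(), to rebuild a string from a list; the sample text is made up.

```python
text = "apple  banana cherry"  # note the two spaces after "apple"

default_split = text.split()       # splits on runs of whitespace
space_split = text.split(" ")      # splits on every single space
limited = "a,b,c".split(",", 1)    # maxsplit=1: split only once
joined = ", ".join(default_split)  # join the pieces back together

print(default_split)  # ['apple', 'banana', 'cherry']
print(space_split)    # ['apple', '', 'banana', 'cherry']
print(limited)        # ['a', 'b,c']
print(joined)         # apple, banana, cherry
```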

8. Stripping Whitespace

The strip() method returns a copy of the string with leading and trailing whitespace (spaces, tabs, and newlines) removed. Whitespace inside the string is kept.

text = "   Hello World   "
print(text.strip())  # Output: "Hello World"
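
To trim only one side, Python offers lstrip() and rstrip(), and all three methods accept an optional argument listing the characters to remove instead of whitespace. A short sketch, using repr() so the remaining spaces are visible; the "--id42--" value is an invented example.

```python
text = "   Hello World   "

left = text.lstrip()             # remove leading whitespace only
right = text.rstrip()            # remove trailing whitespace only
trimmed = "--id42--".strip("-")  # remove a specific character from both ends

print(repr(left))   # 'Hello World   '
print(repr(right))  # '   Hello World'
print(trimmed)      # id42
```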

9. f-Strings for Formatting

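An f-string (formatted string literal, available since Python 3.6) lets you insert variables and expressions directly into a string by placing them inside curly braces {}. It is a concise and readable way of formatting strings.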

name = "Alice"
age = 30
print(f"My name is {name} and I am {age} years old.")
# Output: "My name is Alice and I am 30 years old."
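
Inside the braces, f-strings also accept expressions and a format specifier after a colon, which is handy for rounding numbers or aligning text in reports. The values below are assumed sample data.

```python
name = "Alice"
price = 3.14159
population = 1234567

rounded = f"{name.upper()} pays {price:.2f}"  # method call and 2 decimal places
grouped = f"{population:,}"                   # thousands separator
aligned = f"{name:>10}|"                      # right-align in a field of width 10
percent = f"{0.256:.1%}"                      # format a fraction as a percentage

print(rounded)  # ALICE pays 3.14
print(grouped)  # 1,234,567
print(aligned)  #      Alice|
print(percent)  # 25.6%
```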