Sequential data types (lists)

From Sustainability Methods

In addition to the basic data types, Python (like most other programming languages) has data types that let us manage larger amounts of data. One example is the sequential data types, in which the order of the data points plays an essential role. In a string, as in the data type list that we will consider in a moment, the fixed order of the individual elements carries additional information.

For example, take any phrase, such as "Leuphana Semester". If the characters were not in this fixed order, we would lose information: "Suphaa Leemestern" is hardly recognizable as the original string, although it consists of the same characters. The position of the individual characters/data points thus plays a decisive role.

The sequential data type list behaves in exactly the same way. The only difference is that a list can hold several separate data points, which may be of different types.

An example: We can keep different fruits including their health scores in a list:

fruit_health_scores = ['oranges', 20, 'banana', 12, 'cherries', 10.5, 'apples', 30, 'melon', 8]

The list allows us to keep both the fruit name (string) and its health score (integer or float) in one data structure. As you can imagine, however, the order plays a decisive role here as well. If, for example, two health scores were swapped, they would no longer correspond to the fruit they follow. Information would be lost.

Note: To create a list, we enclose its elements, separated by commas, in square brackets.

fruits = ['oranges', 'banana', 'cherries', 'apples', 'melon']
scores = [20, 12, 10.5, 30, 8]
fruit_health_scores = ['oranges', 20, 'banana', 12, 'cherries', 10.5, 'apples', 30, 'melon', 8]

Note: A list can contain different types of data, including other lists.

complex_list = [23, 'k95', False, 1.3, [1, 2, 3]]  # Note that the last item is itself a list (a list inside a list)
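To make the mix of types visible, the following minimal sketch inspects `complex_list` with the built-in functions `len()` and `type()`. The names `number_of_elements` and `element_types` are illustrative and not part of the original example; the list comprehension is only a compact way to loop over the elements.

```python
complex_list = [23, 'k95', False, 1.3, [1, 2, 3]]

# len() counts the nested list as one single element.
number_of_elements = len(complex_list)
print(number_of_elements)  # 5

# type() reveals the data type of each element.
element_types = [type(item).__name__ for item in complex_list]
print(element_types)  # ['int', 'str', 'bool', 'float', 'list']
```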

Indexing

As we just learned, the order of individual data points is very important in sequential data types (strings and lists). Each character in a string and each element in a list has a unique position. This position is called the index. The first element in a sequence has index 0, the second index 1, the third index 2, and so on.

The following example shows how indexing works, using a string:

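example = "Monty Python"
print(example[0:4])    # the characters at index 0 to 3
print(example[5])      # the character at index 5 is the space between the two words
print(example[-6:-2])  # from the sixth-to-last character up to, but not including, the second-to-last
Mont
 
Pyth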

The interesting thing about indexes is that we are now able to selectively pick out individual elements from sequences. The following example shows how this works for a list.

fruits = ['oranges', 'banana', 'cherries', 'apples', 'melon']
print(fruits[0])
print(fruits[3])
oranges
apples
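Because counting starts at 0, the last valid index of a sequence is its length minus one. The sketch below illustrates this with the same `fruits` list; `last_index` and `message` are illustrative names, and the `try`/`except` construct is used here only to show the error without stopping the program.

```python
fruits = ['oranges', 'banana', 'cherries', 'apples', 'melon']

# The list has 5 elements, so the valid indexes are 0 to 4.
last_index = len(fruits) - 1
print(fruits[last_index])  # melon

# An index beyond the end of the list raises an IndexError.
try:
    fruits[len(fruits)]
except IndexError as error:
    message = str(error)
    print('IndexError:', message)
```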

Alternatively, for sequences, we can use negative indices to access individual elements. The last element in the sequence has the index -1, the second to last the index -2 and so on. Now let's look at some examples.

fruits = ['oranges', 'banana', 'cherries', 'apples', 'melon']
# This returns the last item in the list.
print('Last element:', fruits[-1])

# This returns the last but one item in the list.
print('Last but one element:', fruits[-2])

# This returns the first element.
print('First element:', fruits[-5])
Last element: melon
Last but one element: apples
First element: oranges

Note: As you can see here, we can also use the print() function to output several things at once - in this case a string and the selected item from the list. Both must be separated by a comma inside the print() function.
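Returning to negative indices: a negative index can be read as "length of the sequence plus the index". The following sketch checks this relationship for the `fruits` list; the variable names are illustrative.

```python
fruits = ['oranges', 'banana', 'cherries', 'apples', 'melon']

negative_index = -2
# 5 + (-2) = 3, so fruits[-2] and fruits[3] refer to the same element.
equivalent_positive_index = len(fruits) + negative_index

print(fruits[negative_index])             # apples
print(fruits[equivalent_positive_index])  # apples
```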

Slicing

Slicing is the process of taking multiple data points from an original sequence (e.g. a list or a string) using the indexes.

The syntax for slicing a sequence in Python is as follows:

sequence[start(included):stop(excluded):step]

Note: The above code selects the elements of a sequence from index `start` up to and including index `stop-1`. This means that the element with index `stop` is excluded.

Note: Unless otherwise defined, the default step size is 1.

Let's take a look at a few examples.

smoothy_ingredients = ['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']
# Guess the output
sour_smoothy = smoothy_ingredients[0:2]
print(sour_smoothy)
['kiwi', 'lemon']
# Guess the output
sweet_smoothy = smoothy_ingredients[3:5]
print(sweet_smoothy)
['banana', 'mango']

Note: If we do not specify the start and stop limits, the start limit is automatically `0` and the stop limit is automatically the length of the sequence, i.e. the number of elements in the sequence.

# Guess the output
mysterious_smoothy = smoothy_ingredients[:]
print(mysterious_smoothy)
['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']
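The two limits can also be omitted individually. The sketch below additionally shows that a slice is a new list, so changing the copy leaves the original untouched. The names `first_two`, `from_index_three` and `full_copy`, as well as the extra ingredient `'ginger'`, are made up for this illustration.

```python
smoothy_ingredients = ['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']

# Omitting only the start: begin at index 0.
first_two = smoothy_ingredients[:2]
print(first_two)  # ['kiwi', 'lemon']

# Omitting only the stop: continue to the end of the list.
from_index_three = smoothy_ingredients[3:]
print(from_index_three)  # ['banana', 'mango']

# Omitting both creates a copy that can be changed independently.
full_copy = smoothy_ingredients[:]
full_copy.append('ginger')
print(len(full_copy))            # 6
print(len(smoothy_ingredients))  # 5, the original is unchanged
```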

Now let's test the optional argument "step size" - this value specifies the distance between the indexes of two consecutively selected elements. A step size of 2, for example, selects every second element.

# Guess the output
halved_smoothy = smoothy_ingredients[:5:2]
print(halved_smoothy)
['kiwi', 'passion fruit', 'mango']
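One way to picture the step size is to list the indexes that a slice visits. As a rough sketch, the built-in `range()` function takes the same three arguments (start, stop, step) and produces exactly these indexes; `selected_indexes` and `picked_by_index` are illustrative names.

```python
smoothy_ingredients = ['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']

# The slice [0:5:2] visits the indexes 0, 2 and 4.
selected_indexes = list(range(0, 5, 2))
print(selected_indexes)  # [0, 2, 4]

# Picking these indexes one by one gives the same result as the slice.
picked_by_index = [smoothy_ingredients[i] for i in selected_indexes]
print(picked_by_index)               # ['kiwi', 'passion fruit', 'mango']
print(smoothy_ingredients[0:5:2])    # ['kiwi', 'passion fruit', 'mango']
```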

Note: The step size can also be negative. In that case the sequence is traversed from back to front, which reverses the order of the selected elements.

Example:

reversed_smoothy = smoothy_ingredients[::-1]
print(smoothy_ingredients)
print(reversed_smoothy)
['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']
['mango', 'banana', 'passion fruit', 'lemon', 'kiwi']
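A negative step size can also be combined with explicit limits, and it works for strings as well. In the following sketch, note that with a negative step the start index must be larger than the stop index; the variable names are illustrative.

```python
smoothy_ingredients = ['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']

# Walk backwards from index 3 down to index 1 (index 0 is excluded).
backwards_part = smoothy_ingredients[3:0:-1]
print(backwards_part)  # ['banana', 'passion fruit', 'lemon']

# The same syntax reverses a string.
example = "Monty Python"
reversed_example = example[::-1]
print(reversed_example)  # nohtyP ytnoM
```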

As the example of "halved_smoothy" shows, we can combine the arguments start, stop and step and thus select complex subsets with comparative ease. Such an approach often seems particularly elegant.

Note: In programming, however, this "complexity-hiding elegance" is often deceptive, and one risks tricking oneself. It often makes more sense to divide one's approach (algorithm) into individual, easily understandable code steps. After all, the code we write is meant to be read by humans. The computer translates everything into 0s and 1s anyway.
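As a small illustration of this advice, the sketch below selects "every second ingredient of the first four, in reverse order" in two ways: once as a compact one-liner and once in three named steps. The task and all variable names are made up for this example.

```python
smoothy_ingredients = ['kiwi', 'lemon', 'passion fruit', 'banana', 'mango']

# Compact, but hard to read: start at index 2 and walk backwards in steps of 2.
one_liner = smoothy_ingredients[2::-2]

# The same selection, split into easily understandable steps.
first_four = smoothy_ingredients[0:4]   # ['kiwi', 'lemon', 'passion fruit', 'banana']
every_second = first_four[::2]          # ['kiwi', 'passion fruit']
step_by_step = every_second[::-1]       # ['passion fruit', 'kiwi']

print(one_liner)     # ['passion fruit', 'kiwi']
print(step_by_step)  # ['passion fruit', 'kiwi']
```

Both versions produce the same list, but the second one documents the intention in its variable names.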

You think complex thinking and combining multiple steps are your thing? Then try the next task!

Special operations on lists

Python comes with a number of predefined functions and methods. Among them are list methods that help us to implement typical operations on lists in a simple way.

In the following, we will limit ourselves to three selected operations for manipulating lists:

  1. sorting the elements of a list with the sort() method.
  2. adding elements to a list with the append() method.
  3. removing elements from a list with the remove() method or the del keyword.

Sorting a list

When working with lists of any kind, sorting is a common task. The ready-made sort() method can save you a lot of work.

# Let's create two random lists with strings and numbers
names_list = ['Aurora', 'Hazel', 'Theodore', 'Jasper', 'Silas', 'Oliver', 'Violet', 'Charlotte', 'John', 'Alfred']
numbers_list = [1.3, 2.7, 3.3, 3.3, 5.0, 1.0, 1.7, 2.0, 3.0, 2.7, 3.0, 2.3, 2.3, 4.0, 3.7, 1.7, 2.0, 1.3, 2.7, 2.3, 2.0]
print(names_list)
print(numbers_list)
['Aurora', 'Hazel', 'Theodore', 'Jasper', 'Silas', 'Oliver', 'Violet', 'Charlotte', 'John', 'Alfred']
[1.3, 2.7, 3.3, 3.3, 5.0, 1.0, 1.7, 2.0, 3.0, 2.7, 3.0, 2.3, 2.3, 4.0, 3.7, 1.7, 2.0, 1.3, 2.7, 2.3, 2.0]
# We apply the sort() method directly to the original list. Be sure to note that this changes the original list!
names_list.sort()
numbers_list.sort()
print(names_list)
print(numbers_list)
['Alfred', 'Aurora', 'Charlotte', 'Hazel', 'Jasper', 'John', 'Oliver', 'Silas', 'Theodore', 'Violet']
[1.0, 1.3, 1.3, 1.7, 1.7, 2.0, 2.0, 2.0, 2.3, 2.3, 2.3, 2.7, 2.7, 2.7, 3.0, 3.0, 3.3, 3.3, 3.7, 4.0, 5.0]
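If the original order is still needed, the built-in function `sorted()` can be used instead, since it returns a new sorted list. The sketch below also shows the optional `reverse` argument of `sort()` and the fact that `sort()` itself returns `None`. The short list `numbers` is made up for this illustration.

```python
numbers = [2.7, 1.3, 5.0, 1.0]

# sorted() returns a new list and leaves the original untouched.
sorted_copy = sorted(numbers)
print(sorted_copy)  # [1.0, 1.3, 2.7, 5.0]
print(numbers)      # [2.7, 1.3, 5.0, 1.0]

# sort() changes the list in place and returns None.
return_value = numbers.sort(reverse=True)
print(return_value)  # None
print(numbers)       # [5.0, 2.7, 1.3, 1.0]
```

A common mistake is to write `numbers = numbers.sort()`, which replaces the list with `None`.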

Let's test the limits of the sort() method: is it possible to sort mixed lists?

mixed_list = ['Sweden', 'Spain', 'France', 'Poland', 10099265, 46794470, 65273511, 37755769]
print(mixed_list)
['Sweden', 'Spain', 'France', 'Poland', 10099265, 46794470, 65273511, 37755769]
mixed_list.sort()
print(mixed_list)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [31], in <cell line: 1>()
----> 1 mixed_list.sort()
      2 print(mixed_list)

TypeError: '<' not supported between instances of 'int' and 'str'

Caution: Lists that combine strings and numeric data (integers/floats) cannot be sorted with a plain call to the `sort()` method, because Python cannot compare a string with a number using `<`.
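There are ways around this limitation. The following sketch shows two possible approaches, neither of which is taken from the original lesson: separating the elements by type before sorting, and passing `key=str` so that all elements are compared by their string representation. The names `countries`, `populations` and `sorted_as_text` are illustrative.

```python
mixed_list = ['Sweden', 'Spain', 'France', 'Poland', 10099265, 46794470, 65273511, 37755769]

# Approach 1: split the list by type and sort each part on its own.
countries = sorted(item for item in mixed_list if isinstance(item, str))
populations = sorted(item for item in mixed_list if isinstance(item, int))
print(countries)    # ['France', 'Poland', 'Spain', 'Sweden']
print(populations)  # [10099265, 37755769, 46794470, 65273511]

# Approach 2: compare every element as text. The elements keep their types.
sorted_as_text = sorted(mixed_list, key=str)
print(sorted_as_text)
```

Note that the second approach compares numbers character by character, which only matches the numeric order here because all numbers have the same number of digits.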

Adding an element to a list

Often you need to add or delete items. In the following, we will look at both.

You can add an element to a list using the append() method.

print(names_list)
names_list.append('Charlie')
print(names_list)
['Alfred', 'Aurora', 'Charlotte', 'Hazel', 'Jasper', 'John', 'Oliver', 'Silas', 'Theodore', 'Violet']
['Alfred', 'Aurora', 'Charlotte', 'Hazel', 'Jasper', 'John', 'Oliver', 'Silas', 'Theodore', 'Violet', 'Charlie']

Note: The `append()` method adds the new entry to the very end of the list.

print(numbers_list)
unreported_absent = [5.0, 5.0]
numbers_list.append(unreported_absent)
print(numbers_list)
[1.0, 1.3, 1.3, 1.7, 1.7, 2.0, 2.0, 2.0, 2.3, 2.3, 2.3, 2.7, 2.7, 2.7, 3.0, 3.0, 3.3, 3.3, 3.7, 4.0, 5.0]
[1.0, 1.3, 1.3, 1.7, 1.7, 2.0, 2.0, 2.0, 2.3, 2.3, 2.3, 2.7, 2.7, 2.7, 3.0, 3.0, 3.3, 3.3, 3.7, 4.0, 5.0, [5.0, 5.0]]

Note: With the `append()` method you can only add one element at a time! A list also counts as one element, so appending a list nests it inside the original list.
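If the individual values of a second list should be added instead of the list itself, the list method `extend()` can be used. The sketch below contrasts both methods on a short, made-up list `numbers`; the copies are created with slicing so that both variants start from the same data.

```python
numbers = [4.0, 5.0]
unreported_absent = [5.0, 5.0]

# append() adds the whole list as one nested element.
numbers_appended = numbers[:]
numbers_appended.append(unreported_absent)
print(numbers_appended)  # [4.0, 5.0, [5.0, 5.0]]

# extend() adds each value of the second list individually.
numbers_extended = numbers[:]
numbers_extended.extend(unreported_absent)
print(numbers_extended)  # [4.0, 5.0, 5.0, 5.0]
```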

Removing an element from a list

When we want to remove a value from a list, we can - as so often in programming - proceed in several ways. We can remove an element by specifying the value of the element to be removed (using remove()), or we can delete the element by specifying the index of the value to be removed (using del).

Let's look at some examples.

# We remove an item by value using the `remove()` method.
regional_shops = ['Budnikowsky', 'Aldi', 'Lidl', 'Lidl', 'Penny']
regional_shops.remove('Lidl')
print(regional_shops)
['Budnikowsky', 'Aldi', 'Lidl', 'Penny']

Note: In case of duplicates, the `remove()` method removes only the first occurrence of the value, i.e. the one with the lowest index.

# We remove an item by index using the `del` keyword.
regional_shops = ['Budnikowsky', 'Aldi', 'Lidl', 'Lidl', 'Penny']
del regional_shops[-1]
print(regional_shops)
['Budnikowsky', 'Aldi', 'Lidl', 'Lidl']
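Two details are worth knowing in addition: `remove()` raises a `ValueError` if the value is not in the list, and the list method `pop()` removes an element by index like `del` but also returns it. The following sketch illustrates both; the shop name `'Rewe'` and the variable names are made up for this example.

```python
regional_shops = ['Budnikowsky', 'Aldi', 'Lidl', 'Lidl', 'Penny']

# Checking with `in` first avoids a ValueError for missing values.
shop_to_remove = 'Rewe'
if shop_to_remove in regional_shops:
    regional_shops.remove(shop_to_remove)
else:
    print(shop_to_remove, 'is not in the list')

# pop() removes the element at the given index and returns it.
removed_shop = regional_shops.pop(1)
print(removed_shop)    # Aldi
print(regional_shops)  # ['Budnikowsky', 'Lidl', 'Lidl', 'Penny']
```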

Summary

  1. For sequential data types, such as strings and lists, the order of elements plays a large role.
  2. Elements of sequences can be accessed via their index.
  3. Multiple elements of sequences can be accessed with slicing.
  4. The list methods sort(), append() and remove(), together with the del keyword, enable sorting, adding and deleting elements in lists.



The author of this entry is Wanja Tolksdorf.