Data Types in Python

From Sustainability Methods
Revision as of 22:05, 17 September 2024 by Gustavo

Purpose of data types

Data are pieces of information that are recorded or measured, for example a number or a text. We need different data types to represent these different kinds of information. For example, how can we describe a school class? With the names of the students (= text)? With the number of students (= a number)? And how can we record who is present today and who is not (= a value that is either True or False)?

Overview of basic data types

To represent these different types of information, there are different data types in Python which have specific characteristics and limitations. The four basic data types are summarized in the following table.

Data type   Example use          Example value   Description
Integer     Number of students   2               Whole numbers
Float       Height in meters     2.6             Decimal numbers
String      Name of a student    'Sam'           Text enclosed in quotes
Boolean     Present or not       True            True or False values

When we assign a value to a variable, Python automatically recognizes what type of value it is based on how the value is written. If a number contains a decimal point (.), Python treats it as a Float. If there is no decimal point in a number, Python recognizes an Integer. If something is enclosed in quotes (" or '), it is a String. The words True and False, written without quotes, are recognized as Booleans.

Attention: To humans it often seems obvious which data type we are dealing with. Python's interpretation does not always match this intuition. For example, we read '20' as a number, but Python interprets it as a string because it is enclosed in quotation marks.
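
As a small illustration of how Python reads values, the following sketch checks a few of them with the built-in function type(), which is covered in more detail in the section on type checking. The variable names and values are made up for this example.

```python
# Hypothetical example values; the names are only for illustration.
students = 20        # no decimal point -> int
height = 20.0        # decimal point -> float
label = '20'         # quotes -> str, even though it looks like a number
present = True       # the keywords True and False -> bool

print(type(students))
print(type(height))
print(type(label))
print(type(present))
```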

Basic data types and operations

Depending on what data type we are dealing with, there are different things that can be done with it. Most of the limitations are self-explanatory if you think about it. But let's take a look at the different data types.

Integers

Integers are whole numbers (positive, negative or zero) without a decimal point. We can use the operators +, -, *, /, //, % and ** with them.

  • Python uses the percent sign % as a modulo operator. The modulo operation returns the remainder that results from dividing two numbers.
  • // is a special division operator called floor division. The floor division operator (also called integer division) divides the number a by the number b and rounds the result down to the nearest whole number. For positive numbers this simply discards the remainder of the division.
  • The operator ** raises a number to a power: a ** b calculates a to the power of b.
number1 = 2
number2 = 10
print(number1 + number2)  # Adds the two numbers
print(number2 - number1)  # Subtracts the two numbers
print(number1 * number2)  # Multiplies the two numbers
print(number2 / number1)  # Divides the two numbers
print(number2 // number1)  # Floor division of the two numbers
print(number2 % number1)  # Remainder of the division
print(number2 ** number1)  # Calculates number2 to the power of number1
12
8
20
5.0
5
0
100
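
In the example above the division leaves no remainder, so // and % are not very visible. The following sketch uses made-up values where a remainder occurs, and also shows what "rounding down" means for a negative number.

```python
# Hypothetical values chosen so that the division leaves a remainder.
a = 7
b = 2

quotient = a // b     # floor division
remainder = a % b     # modulo

print(quotient)
print(remainder)

# The two results always fit together like this:
print(quotient * b + remainder == a)

# With a negative number, // rounds down (toward negative infinity),
# so the result is -4 and not -3.
print(-7 // 2)
print(-7 % 2)
```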

Floats

Floats are all numbers that contain a decimal point. All operations discussed in the section on integers can also be applied to floats.

Note: The decimal point has to be a point, not a comma.

Note: Any arithmetic operation that involves a floating point number produces a floating point value as a result. Furthermore, dividing two integers with / always produces a float value, even if the result is a whole number, e.g. `result = 4 / 2` gives 2.0.
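
The note above can be checked with a short sketch. The variable names are only chosen for this illustration.

```python
whole = 4 / 2          # division with / always gives a float
floored = 4 // 2       # floor division of two integers gives an integer
mixed = 3 + 1.0        # an integer combined with a float gives a float

print(whole, type(whole))
print(floored, type(floored))
print(mixed, type(mixed))
```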

float1 = 2.7
float2 = 1.3

print(float1 + float2)  # Adds the two numbers
print(float2 - float1)  # Subtracts the two numbers
print(float1 * float2)  # Multiplies the two numbers
print(float2 / float1)  # Divides the two numbers
print(float2 // float1)  # Floor division of the two numbers
print(float2 ** float1)  # Calculates the float2 to the power of float1
4.0
-1.4000000000000001
3.5100000000000002
0.48148148148148145
0.0
2.0307059963850898

Caution: Why does the second operation (the subtraction float2 - float1) not result in the value -1.4?

The computer performs all calculations with binary numbers (0 and 1) and then converts the results back into decimal numbers. Many decimal values, such as 1.3 or 2.7, have no exact binary representation, so the computer stores the closest value it can represent. This is a side effect of how the Central Processing Unit (CPU) represents floating point data. Because of this, there may be some loss of precision, and some floating point operations may produce unexpected results. For this reason, when calculating float2 - float1, the correct answer can only be approximated. When using floats, we should always keep these approximation errors in mind!
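
One practical consequence is that comparing floats with == can fail unexpectedly. The sketch below repeats the subtraction from above and compares the result in two ways. The function math.isclose() comes from Python's standard library and is not used elsewhere in this entry; it is shown here only as one possible way to compare floats within a small tolerance.

```python
import math

float1 = 2.7
float2 = 1.3
difference = float2 - float1

print(difference == -1.4)              # exact comparison fails
print(math.isclose(difference, -1.4))  # comparison within a small tolerance
```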

Note: One way to deal with these approximation errors is to round the results when calculating with floats.

We can round float numbers to any number of decimal places. This is done with the Python function round(). The function round() takes two arguments: The first argument is the number to be rounded, and the second argument is the desired number of decimal places.

Take a look at the following example.

float_3 = float2 - float1

print(float_3)
print(round(float_3, 1))
-1.4000000000000001
-1.4

Strings

Strings are sequences of characters (letters, digits, symbols and spaces) enclosed in quotes. For Python, it doesn't matter whether we use single ' or double " quotes to mark strings, but you should do so consistently.

The two most important operations for strings are concatenation (joining) with the + operator and repetition with the * operator. Other arithmetic operators such as - and / are not defined for strings, which makes sense if you think about it: what would it mean to divide one text by another?

word_1 = ''  # This is an empty string
word_2 = 'Hello '
word_3 = 'Friend '

print(word_1)  # Prints an empty line
print(word_2)
print(word_3)
Hello 
Friend 

The concatenation of two strings looks like this:

print(word_2 + word_3)
Hello Friend 

Repeating a string looks like this:

print(word_2 * 3)
Hello Hello Hello 
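
The following sketch shows one everyday use of repetition and what happens when an operator is used that strings do not support. The variable names are made up for this illustration.

```python
word = 'Hello '
separator = '=' * 10      # repetition is handy for simple separator lines

print(separator)
print(word + 'Sam')
print(separator)

# Subtraction is not defined for strings and raises a TypeError.
subtraction_failed = False
try:
    word - 'o'
except TypeError as error:
    subtraction_failed = True
    print('TypeError:', error)
```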

Booleans

Booleans are a special data type in programming that knows only two values: either something is True or False - there is no in-between. Booleans are the basis for making decisions and controlling the program flow. As soon as you want a program to do different things depending on certain conditions, this kind of value becomes extremely important.

bool_0 = False
bool_1 = True

print(bool_0)
print(bool_1)
False
True
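
Booleans are rarely typed in by hand. More often they are the result of a comparison, which can then steer the program. The sketch below uses made-up numbers for a school class to illustrate this; the if/else statement is only used here to show the idea of controlling the program flow.

```python
students = 18          # hypothetical number of students present
room_capacity = 20     # hypothetical number of seats

fits_in_room = students <= room_capacity   # a comparison returns a Boolean
print(fits_in_room)
print(type(fits_in_room))

if fits_in_room:
    print('Everyone gets a seat.')
else:
    print('We need a bigger room.')
```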

Type checking and Type casting

As you might have noticed already, some operations only work with certain data types. For example, it is not possible to add an integer and a string.

a = 3
b = "3" 
c = a + b # error
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [6], in <cell line: 3>()
      1 a = 3
      2 b = "3" 
----> 3 c = a + b # error

TypeError: unsupported operand type(s) for +: 'int' and 'str'

It is also not possible to concatenate a string and an integer.

name = "Sam"
age = 33
print(name + "-" + age) # error
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [7], in <cell line: 3>()
      1 name = "Sam"
      2 age = 33
----> 3 print(name + "-" + age) # error

TypeError: can only concatenate str (not "int") to str

Python detects the data types by itself, but as the previous examples show, the detected type is sometimes not the one you need. Therefore, it is important to know which data types are involved in your analysis and to be able to change them.

Type checking lets us find out the data type of a variable at any time, and type casting lets us (in many cases) convert a variable into another data type.

The built-in function type() helps us to check the data type of a value in Python. So, we can check the data types from the example above:

print(type(a))
print(type(b))
print(type(name))
print(type(age))
<class 'int'>
<class 'str'>
<class 'str'>
<class 'int'>

We can see that in the first example we tried to add an integer and a string, which is not possible. In the second example we tried to concatenate a string and an integer, which is not possible either. To perform these operations, we first need to change the data types.
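
Besides printing the result of type(), a program can also check a type and react to it. The sketch below uses the built-in function isinstance(), which is not covered elsewhere in this entry, as one possible way to do this.

```python
a = 3
b = "3"

print(isinstance(a, int))
print(isinstance(b, int))
print(isinstance(b, str))

# Only add the values if both are numbers.
if isinstance(a, (int, float)) and isinstance(b, (int, float)):
    print(a + b)
else:
    print("Cannot add:", type(a).__name__, "and", type(b).__name__)
```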

The following built-in functions help us to change the data types of values.

Function   Use
int()      Converts a value to an integer value
float()    Converts a value to a float value
str()      Converts a value to a string value
bool()     Converts a value to a boolean value

If we convert b into an integer, the code runs without problems:

b = int(b)
c = a + b

To concatenate name and age, we need to convert age into a string:

age = str(age)
print(name + "-" + age)
Sam-33
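
The following sketch tries out each of the four conversion functions on a few made-up values. Some results may be surprising, especially for int() with floats and for bool() with strings.

```python
print(int("42"))      # string of digits -> integer
print(int(2.9))       # float -> integer: the decimal part is cut off, not rounded
print(float("2.6"))   # string -> float
print(float(2))       # integer -> float
print(str(2.6))       # float -> string
print(bool(0))        # 0 counts as False
print(bool(7))        # any other number counts as True
print(bool(""))       # the empty string counts as False
print(bool("False"))  # any non-empty string counts as True, even this one
```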

The following example shows another situation in which you need type casting.

# Example for Type-Casting

'''
1. Initial situation: We have a predefined variable 'amount'. We want to add 5 to this variable!
2. Problem statement: This operation is not possible,
                      since the value of 'amount' is a string.
3. Solution approach: We have to cast the variable 'amount' to an integer.
                      Then we are able to perform the addition.
'''

# 1.
amount = '1004'

# 2.
# print(amount + 5) # If you remove the first '#', you will see that this raises a TypeError.
print(type(amount))

# 3.
amount = int(amount)
print(type(amount))
print(amount + 5)
<class 'str'>
<class 'int'>
1009

Caution: You cannot convert a string of non-numeric characters into an integer or float! Python raises a ValueError in this case. However, it is always possible to convert integers and floats into strings. Feel free to try it out on your own.
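
A minimal sketch of how such a failed conversion could be handled is shown below. The function to_int_or_none() is a hypothetical helper written for this illustration, not a built-in Python function.

```python
def to_int_or_none(text):
    """Hypothetical helper: return text as an int, or None if that fails."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int_or_none('1004'))
print(to_int_or_none('Sam'))
print(to_int_or_none('2.6'))   # int() does not accept a string with a decimal point
print(int(float('2.6')))       # going through float() works; the decimals are cut off
```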

Summary

  1. You have learned about the basic data types: strings, booleans, integers, and floats. These data types are the basis for representing different kinds of data such as texts, numbers and truth values.
  2. Different operations are available depending on the data type, so it is important to select the right data type.
  3. Type checking and type casting help to check and change data types.

The authors of this entry are Wanja Tolksdorf and Gustavo Rodriguez