Binomial distribution

From Sustainability Methods
Revision as of 15:11, 14 March 2024 by Evgeniyaz

THIS ARTICLE IS STILL IN EDITING MODE

Definition

The binomial distribution is a probability distribution that describes the number of successes in a fixed number of independent trials, where each trial has only two possible outcomes and the same probability of success. The distribution is obtained by performing a number of such trials, which are called Bernoulli trials.

A Bernoulli trial is assumed to meet each of these criteria:

  • There must be only 2 possible outcomes.
  • Each outcome has a fixed probability of occurring. A success has the probability of p, and a failure has the probability of 1 – p.
  • Each trial is completely independent of all others.
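As a minimal sketch of these criteria, the following simulates independent Bernoulli trials with Python's standard random module and counts the successes. The helper names bernoulli_trial and count_successes, the seed, and the values n = 10 and p = 0.6 are illustrative assumptions and not part of the original entry.

```python
import random

def bernoulli_trial(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

def count_successes(n, p, rng):
    """Run n independent Bernoulli trials and return the number of successes."""
    return sum(bernoulli_trial(p, rng) for _ in range(n))

rng = random.Random(42)  # arbitrary fixed seed so that the run is repeatable
successes = count_successes(n=10, p=0.6, rng=rng)
print(successes)  # one draw from a binomial distribution with n = 10, p = 0.6
```

Each call to count_successes produces one binomially distributed value; repeating it many times approximates the whole distribution.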

The binomial distribution is a discrete distribution, which describes the outcome of binary scenarios. A discrete distribution is defined over a set of separate, countable values: for example, the result of a coin toss is discrete, as it can only be heads or tails, while the height of a person is continuous, as it can be 170, 170.01, 170.11 and so on.

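When the number of trials n is large enough and p is not too close to 0 or 1, the binomial distribution is quite similar to a normal distribution whose loc (mean) is np and whose scale (standard deviation) is the square root of np(1 – p).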
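The similarity to the normal distribution can be checked numerically. The sketch below assumes loc = np and scale = sqrt(np(1 – p)), the mean and standard deviation of the binomial distribution, and compares the two curves point by point. The function names and the choice n = 100, p = 0.5 are illustrative assumptions.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, loc, scale):
    """Density of a normal distribution with mean loc and standard deviation scale."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2 * math.pi))

n, p = 100, 0.5
loc = n * p                         # mean of the binomial distribution
scale = math.sqrt(n * p * (1 - p))  # standard deviation of the binomial distribution

# Largest absolute gap between the binomial PMF and the normal density
max_diff = max(abs(binom_pmf(k, n, p) - normal_pdf(k, loc, scale))
               for k in range(n + 1))
print(max_diff)
```

A small value of max_diff relative to the peak of the PMF indicates that the two distributions are close for these parameters.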

NumPy's random.binomial() function, which is used below, takes three arguments:

  • n - number of trials;
  • p - probability of success in each trial;
  • size - the shape of the returned array.

Example

Consider a random experiment of tossing a biased coin 6 times, where the probability of getting a head is 0.6. If "getting a head" is considered a "success", then the binomial distribution table contains the probability of r successes, P(r), for each possible value of r.

r 0 1 2 3 4 5 6
P(r) 0.004096 0.036864 0.138240 0.276480 0.311040 0.186624 0.046656
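The values in this table can be reproduced with the binomial formula P(r) = C(n, r) · p^r · (1 – p)^(n – r). The following is a minimal sketch using only the standard math module; the variable name table is an illustrative choice.

```python
import math

n, p = 6, 0.6  # six tosses, probability of a head is 0.6

# P(r) = C(n, r) * p^r * (1 - p)^(n - r) for r = 0, 1, ..., n
table = [math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

for r, prob in enumerate(table):
    print(f"P({r}) = {prob:.6f}")
```

The probabilities sum to 1 (up to floating-point rounding), since the values of r from 0 to 6 cover every possible outcome.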

Binomial distribution in Python

Now, we will use Python to draw random samples from the distribution with NumPy and plot them using Matplotlib and seaborn.

from numpy import random

# Draw 21 samples; each one is the number of successes in 21 trials with p = 0.5
x = random.binomial(n=21, p=0.5, size=21)
print(x)

[10 7 10 11 8 10 12 14 11 7 8 6 8 12 14 8 11 14 16 11 10]

Visualization

import matplotlib.pyplot as plt
import seaborn as sns
from numpy import random

# Plot a histogram of 100 samples drawn with n = 10 trials and p = 0.5
sns.displot(random.binomial(n=10, p=0.5, size=100))
plt.show() # Figure 1

[[File:bin_distr_plot_1.png|550px]]<br>
''Figure 1: Binomial distribution plot''


==Python Calculation: Coin Flip Prediction==

===Using the math Library===

We can calculate the probability of getting an even number of heads in 21 flips of an unbiased coin using the math library in Python.

For each value of ''i'' in the range 0 to 22 (exclusive), with a step of 2, we can calculate the corresponding probability of getting ''i'' heads using the binomial distribution formula.

Inside the loop, we can first calculate the number of ways to choose ''i'' heads out of 21 flips using the <syntaxhighlight lang="Python" inline>math.comb()</syntaxhighlight> function.

We can then calculate the probability of any one specific sequence of 21 flips of an unbiased coin, which is (1/2)<sup>21</sup>. Finally, we can multiply the two values to obtain the probability of getting exactly ''i'' heads in 21 flips and store it in a list called probabilities.

After the loop finishes, we can compute the sum of all the probabilities stored in the list probabilities using the sum() function and print the total probability of getting an even number of heads in 21 flips.

import math

# Create an empty list to store the probabilities of getting an even number of heads
probabilities = []

# Loop over all even numbers of heads that can be obtained in 21 flips
for i in range(0, 22, 2):
    # Calculate the number of ways to choose i heads out of 21 flips
    m = math.comb(21, i)
    # Calculate the probability of any one specific sequence of 21 flips
    n = (1/2) ** 21
    prob = m * n
    # Append the probability to the list
    probabilities.append(prob)

# Loop over the probabilities and print the probability of getting each even number of heads
for i, prob in enumerate(probabilities):
    print("The probability of getting {} heads in 21 flips is {}.".format(2*i, prob))

# Calculate the total probability 
total_prob = sum(probabilities)
print("Total probability of getting even number of heads is {}".format(total_prob))

The probability of getting 0 heads in 21 flips is 4.76837158203125e-07.
...
The probability of getting 10 heads in 21 flips is 0.16818809509277344.
...
Total probability of getting even number of heads is 0.5

===Using the SciPy Library===

This time, we will use the pre-built function to calculate the probability.


1. Import the binom() function from the scipy.stats module.
2. Define the binomial distribution with n trials and probability p of success using the binom() function.

We calculate the probability of getting each even number of heads (0, 2, 4, …, 20) using the pmf() method of the binomial distribution. It returns the value of the probability mass function (PMF) of the distribution for each value in the given range.

Then we implement another for-loop to print the probability of getting each even number of heads in the list even_probs. Finally, we calculate the total probability of getting an even number of heads in 21 flips by summing up the probabilities in the list even_probs.

from scipy.stats import binom

n = 21
p = 0.5

# Define the binomial distribution with n trials and probability p of success
binom_dist = binom(n, p)

# Calculate the probability of getting an even number of heads (0, 2, 4, ..., 20)
even_probs = binom_dist.pmf(range(0, 22, 2))

# Print the probability of getting each even number of heads
for i, prob in enumerate(even_probs):
    print("The probability of getting {} heads in 21 flips is {}".format(2*i,prob))

# Calculate the total
total_prob = sum(even_probs)
print("Total probability of getting even number of heads is {}".format(total_prob))

The probability of getting 0 heads in 21 flips is 4.768371582031256e-07
...
The probability of getting 10 heads in 21 flips is 0.16818809509277355
...
Total probability of getting even number of heads is 0.5000000000000002

The SciPy library provides a more convenient calculation method. It allows us to define the distribution and calculate its probabilities in just a few lines of code.

===Why are the Results (Slightly) Different?===

The difference between the SciPy and math results is due to floating-point rounding. Floating-point numbers are stored in a limited number of bits, so intermediate results may be rounded, and two mathematically equivalent calculations that perform different sequences of operations can produce results that differ in the last digits. This is why the two methods give slightly different values for the probabilities of getting an even number of heads: the results above differ only at around the 16th significant digit.

Both methods provide a good approximation of the true probability, and the difference between the two is negligible for most practical purposes.
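To make this concrete, the following sketch computes the total exactly with rational arithmetic from the standard fractions module and then compares the two totals printed above using a tolerance rather than strict equality. The variable names are illustrative, and the two floating-point totals are copied from the outputs shown earlier rather than recomputed.

```python
import math
from fractions import Fraction

# Exact total: sum of C(21, i) / 2^21 over even i, with no rounding involved
exact_total = sum(Fraction(math.comb(21, i), 2**21) for i in range(0, 22, 2))
print(exact_total)  # 1/2

# Totals copied from the printed outputs of the two methods above
math_total = 0.5
scipy_total = 0.5000000000000002

print(math_total == scipy_total)              # False: strict equality fails
print(math.isclose(math_total, scipy_total))  # True: equal within tolerance
```

Comparing floating-point results with math.isclose() instead of == is a common way to avoid being misled by differences of this size.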


The author of this entry is Raj Chaudhari.