Difference between revisions of "Binomial distribution"

From Sustainability Methods
Latest revision as of 12:31, 3 September 2024

THIS ARTICLE IS STILL IN EDITING MODE

Definition

The binomial distribution is a probability distribution that describes the number of successes in a fixed number of trials, where each trial can take only one of two possible outcomes. The distribution is obtained by performing a number of Bernoulli trials. Each trial is completely independent of all others.

A Bernoulli trial is assumed to meet each of these criteria:

  • there must be only 2 possible outcomes;
  • each outcome has a fixed probability of occurring. A success has the probability of p, and a failure has the probability of 1 – p.

The binomial distribution is a discrete distribution that describes the outcome of binary scenarios. A discrete distribution is defined over a set of separate values: the result of a coin toss is discrete, since it can only be heads or tails, while a person's height is continuous, since it can be 170, 170.01, 170.11 and so on. When the number of trials n is large enough (and p is not too close to 0 or 1), the binomial distribution closely resembles a normal distribution with loc (mean) np and scale (standard deviation) √(np(1 – p)).
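The similarity to the normal distribution can be checked numerically. The following is a minimal sketch, not part of the original entry: the helper names binomial_pmf and normal_pdf are hypothetical, and the normal curve is assumed to use mean np and standard deviation √(np(1 – p)). It uses only the standard math library.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * std**2)) / (std * math.sqrt(2 * math.pi))

n, p = 100, 0.5
mean = n * p                      # assumed loc of the matching normal curve
std = math.sqrt(n * p * (1 - p))  # assumed scale of the matching normal curve

# Largest absolute gap between the two curves over all possible outcomes
max_diff = max(
    abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, std)) for k in range(n + 1)
)
print(f"mean={mean}, std={std}, largest difference={max_diff:.6f}")
```

For n = 100 the two curves already differ by only a small fraction of the peak probability; repeating the check with a smaller n (for example n = 5) shows a visibly larger gap.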

Binomial Distribution Formula

The binomial distribution formula gives the probability that the event occurs exactly k times in n trials:

$$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k} \qquad (1)$$

where p is the probability of success, 1 – p is the probability of failure, n is the number of trials, k is the number of successes, and

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$

Example

Consider a random experiment of tossing a biased coin 6 times, where the probability of getting a head is 0.6. If "getting a head" is considered a "success", the binomial distribution table contains the probability P(r) of r successes for each possible value of r. Substituting p=0.6, 1-p=0.4, n=6 and k=r=0,1,2,3,4,5,6 into formula (1) gives the following table of the binomial distribution:

r 0 1 2 3 4 5 6
P(r) 0.004096 0.036864 0.138240 0.276480 0.311040 0.186624 0.046656
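The table can be reproduced directly from the binomial formula. This is a small sketch using only the standard math library; the variable name table is chosen here for illustration.

```python
import math

n, p = 6, 0.6  # six tosses, probability of a head is 0.6

# P(r) for every possible number of heads r = 0, 1, ..., 6
table = {r: math.comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)}

for r, prob in table.items():
    print(f"P({r}) = {prob:.6f}")
```

The seven probabilities sum to 1, as they must for a complete probability distribution.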

Binomial distribution in Python

Now, we will use Python to analyse the distribution and plot the graph.

from numpy import random
x = random.binomial(n=21, p=0.5, size=21)
print(x)

[10 7 10 11 8 10 12 14 11 7 8 6 8 12 14 8 11 14 16 11 10]

random.binomial has three parameters:

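  • n - number of trials;
  • p - probability of success in each trial;
  • size - the shape of the returned array.

In the example above it returns an array of 21 values (size=21), each giving the number of successes in n=21 trials. More information about this function can be found in the NumPy documentation: https://numpy.org/doc/stable/reference/random/generated/numpy.random.binomial.html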
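To illustrate how size controls the shape of the result, here is a short sketch. It is an assumption of this example, not of the original entry, that NumPy's newer Generator interface (numpy.random.default_rng) is used instead of the legacy random.binomial call; the seed value is arbitrary.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(seed=42)

# size=5 -> five independent experiments, each counting successes in 21 trials
counts = rng.binomial(n=21, p=0.5, size=5)

# size=(2, 3) -> the same kind of counts, arranged as a 2x3 array
grid = rng.binomial(n=21, p=0.5, size=(2, 3))

print(counts.shape, grid.shape)  # prints (5,) (2, 3)
```

Every entry of either array is an integer between 0 and n, because it counts successes in n trials.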

Visualization

import matplotlib.pyplot as plt
import seaborn as sns
from numpy import random

# Histogram of 100 samples, each counting successes in 10 trials
sns.displot(random.binomial(n=10, p=0.5, size=100))
plt.show() # Figure 1

Figure 1: Binomial distribution plot

Python Calculation: Coin Flip Prediction

Using the math Library

We can compute the probability of getting an even number of heads in 21 flips of an unbiased coin using the math library in Python. For each value of i in the range 0 to 22 (exclusive), with a step of 2, i.e. i = 0, 2, 4, ..., 20, we calculate the probability of getting exactly i heads using the binomial distribution formula.

Inside the loop, we first calculate the number of ways to choose i heads out of 21 flips using the math.comb() function. We then calculate the probability of any one specific sequence of 21 flips, which is (1/2)^21 for an unbiased coin. Finally, we multiply the two values to obtain the probability of getting exactly i heads in 21 flips and store it in a list called probabilities.

After the loop finishes, we can compute the sum of all the probabilities, stored in the list probabilities using the sum() function and print the total probability of getting an even number of heads in 21 flips.

import math

# Create an empty list to store the probabilities of getting an even number of heads
probabilities = []

# Loop over all even numbers of heads that can be obtained in 21 flips
for i in range(0, 22, 2):
    # Calculate the number of ways to choose i heads out of 21 flips
    m = math.comb(21, i)
    # Probability of any one specific sequence of 21 flips: (1/2)^21
    seq_prob = 0.5 ** 21
    prob = m * seq_prob
    # Append the probability to the list
    probabilities.append(prob)

# Loop over the probabilities and print the probability of getting each even number of heads
for i, prob in enumerate(probabilities):
    print("The probability of getting {} heads in 21 flips is {}.".format(2*i, prob))

# Calculate the total probability 
total_prob = sum(probabilities)
print("Total probability of getting even number of heads is {}".format(total_prob))

The probability of getting 0 heads in 21 flips is 4.76837158203125e-07.
...
The probability of getting 10 heads in 21 flips is 0.16818809509277344.
...
Total probability of getting even number of heads is 0.5

Using the SciPy Library

This time, we will use the pre-built function to calculate the probability.

1. Import the binom() function from the scipy.stats module.
2. Define the binomial distribution with n trials and probability p of success using the binom() function.

We calculate the probability of getting an even number of heads (0, 2, 4, …, 20) using the pmf() method of the binomial distribution. It returns the probability mass function (PMF) of the distribution for each value in the given range.

Then we implement another for-loop to print the probability of getting each even number of heads in the list even_probs. Finally, we calculate the total probability of getting an even number of heads in 21 flips by summing up the probabilities in the list even_probs.

from scipy.stats import binom

n = 21
p = 0.5

# Define the binomial distribution with n trials and probability p of success
binom_dist = binom(n, p)

# Calculate the probability of getting an even number of heads (0, 2, 4, ..., 20)
even_probs = binom_dist.pmf(range(0, 22, 2))

# Print the probability of getting each even number of heads
for i, prob in enumerate(even_probs):
    print("The probability of getting {} heads in 21 flips is {}".format(2*i,prob))

# Calculate the total
total_prob = sum(even_probs)
print("Total probability of getting even number of heads is {}".format(total_prob))

The probability of getting 0 heads in 21 flips is 4.768371582031256e-07
...
The probability of getting 10 heads in 21 flips is 0.16818809509277355
...
Total probability of getting even number of heads is 0.5000000000000002

The SciPy library provides a more convenient way to do the calculation. It allows us to define the distribution and evaluate its probabilities in just a few lines of code, without writing out the formula ourselves.

Why are the Results (Slightly) Different?

The difference between the SciPy and math calculations is due to floating-point rounding. Floating-point numbers are stored in a limited number of bits, so most intermediate results have to be rounded, and two methods that carry out the calculation through different sequences of operations can accumulate slightly different rounding errors. This is why we get slightly different values for the probabilities of getting an even number of heads when comparing the results of the two methods.

Both methods provide a good approximation of the true probability, which is exactly 0.5 here, and the difference between the two is negligible for most practical purposes.
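To confirm that the discrepancy is only a rounding effect, the same sum can be computed with exact rational arithmetic. This is a minimal sketch that is not part of the original calculations; it assumes Python's standard fractions module is an acceptable stand-in for exact arithmetic.

```python
from fractions import Fraction
import math

n = 21

# Exact probability of each even number of heads, as a ratio of integers
even_probs = [Fraction(math.comb(n, k), 2**n) for k in range(0, n + 1, 2)]

# Summing Fractions involves no rounding at all
total = sum(even_probs)
print(total)  # prints 1/2
```

Because no floating-point numbers are involved, the result is exactly 1/2, which is the value that both the math and the SciPy versions approximate.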


The author of this entry is Raj Chaudhari. Edited by Evgeniya Zakharova.