Poisson distribution formulas, equations, model, properties

Jonah Lester

The Poisson distribution is a discrete probability distribution that gives the probability that a given number of events occurs within a certain interval, when the sample size is large and the probability of each individual event is small.

Often, the Poisson distribution can be used in place of the binomial distribution, as long as two conditions are met: the sample is large and the probability is small.

Figure 1. Graph of the Poisson distribution for different parameters. Source: Wikimedia Commons.

Siméon-Denis Poisson (1781-1840) created the distribution that bears his name, which is very useful for modeling unpredictable events. Poisson published his results in 1837, in a research work on the probability of erroneous criminal sentences.

Later, other researchers adapted the distribution to other areas, for example, the number of stars found in a certain volume of space, or the probability that a soldier dies from the kick of a horse.

Article index

  • 1 Formula and equations
  • 2 Model and properties
    • 2.1 Differences with the binomial distribution
  • 3 Examples
    • 3.1 Practical applications
    • 3.2 Approximating the binomial distribution with the Poisson distribution
  • 4 Solved exercises
    • 4.1 Exercise 1
    • 4.2 Exercise 2
  • 5 References

Formula and equations

The mathematical form of the Poisson distribution is as follows:

$$P(y = k) = \frac{\mu^{k} e^{-\mu}}{k!}$$

where:

- μ (also sometimes denoted as λ) is the mean or parameter of the distribution

- e is Euler's number: e ≈ 2.71828

- P(y = k) is the probability that the random variable y takes the value k

- k is the number of successes: 0, 1, 2, 3, ...

- n is the number of trials or events (the sample size)
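As a minimal sketch, the probability function can be written in a few lines of Python. The function name `poisson_pmf` and the example mean of 2 are illustrative choices, not part of the original text.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k events when the mean number of events is mu."""
    if k < 0:
        return 0.0
    return mu ** k * math.exp(-mu) / math.factorial(k)


# Example with an arbitrary mean of 2 events per interval (assumed value).
for k in range(5):
    print(k, round(poisson_pmf(k, 2.0), 4))
```

Summing the probabilities over all k gives 1, which is a quick sanity check for any implementation.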

Discrete random variables, as their name indicates, depend on chance and only take discrete values: 0, 1, 2, 3, 4 ..., k.

The mean of the distribution is given by:

$$\mu = n \cdot p$$

The variance σ², which measures the spread of the data, is another important parameter. For the Poisson distribution it is:

σ² = μ
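The fact that the variance of a Poisson variable equals its mean can be checked numerically. The sketch below sums the probability function over a truncated range; the helper names and the mean of 3.5 are assumptions made for illustration.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def mean_and_variance(mu, k_max=100):
    """Approximate mean and variance by summing over k = 0..k_max."""
    probs = [poisson_pmf(k, mu) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


mean, var = mean_and_variance(3.5)
print(mean, var)  # both values should be very close to 3.5
```

The truncation at `k_max` is harmless here because the probabilities beyond that point are vanishingly small for moderate means.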

Poisson determined that when n → ∞ and p → 0, the mean μ, also called the expected value, tends to a constant:

μ → constant

Important: p is the probability of occurrence of the event in the total population, while P(y) is the Poisson prediction for the sample.
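The limit can be illustrated numerically: if the product n·p is held fixed while n grows, the binomial probability of a given count approaches the Poisson value. This is only a sketch; the mean of 2 and the count k = 3 are arbitrary choices.

```python
import math


def binomial_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


mu, k = 2.0, 3  # assumed example values
errors = []
for n in (10, 100, 1000, 10000):
    p = mu / n  # keep n * p equal to mu while n grows
    gap = abs(binomial_pmf(k, n, p) - poisson_pmf(k, mu))
    errors.append(gap)
    print(n, round(binomial_pmf(k, n, p), 6), round(poisson_pmf(k, mu), 6))
```

The gap stored in `errors` shrinks at each step, which is the behavior the limit predicts.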

Model and properties

The Poisson distribution has the following properties:

-The sample size is large: n → ∞.

-The events considered are independent of each other and occur randomly.

-The probability p that a certain event occurs during a specific period of time is very small: p → 0.

-The probability that more than one event occurs in a sufficiently short sub-interval of time is negligible (it tends to 0).

-The average value approximates a constant given by: μ = n·p (n is the sample size).

-Since the variance σ² is equal to μ, the variability becomes greater as μ takes larger values.

-The events must be evenly distributed in the time interval used.

-The set of possible values of the variable y is: 0, 1, 2, 3, 4, ...

-The sum of several independent variables that each follow a Poisson distribution is also a Poisson variable. Its average value is the sum of the average values of these variables.
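The last property can be verified directly for two independent variables: the distribution of their sum, computed term by term, matches a Poisson distribution whose mean is the sum of the two means. The means 1.5 and 2.5 below are assumed example values.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def pmf_of_sum(k, mu1, mu2):
    """P(Y1 + Y2 = k) for independent Poisson Y1 and Y2, by direct convolution."""
    return sum(poisson_pmf(j, mu1) * poisson_pmf(k - j, mu2) for j in range(k + 1))


# Compare the convolution with a single Poisson of mean 1.5 + 2.5 = 4.
for k in range(6):
    print(k, round(pmf_of_sum(k, 1.5, 2.5), 6), round(poisson_pmf(k, 4.0), 6))
```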

Differences with the binomial distribution

The Poisson distribution differs from the binomial distribution in the following important ways:

-The binomial distribution is affected by both the sample size n and the probability p, but the Poisson distribution is only affected by the mean μ.

-In a binomial distribution, the possible values of the random variable y are 0, 1, 2, ..., n, whereas in the Poisson distribution there is no upper limit for these values.

Examples

Poisson initially applied his famous distribution to legal cases, but at an industrial level, one of its earliest uses was in brewing beer, a process in which yeast cultures are used for fermentation.

Yeast consists of living cells, whose population varies over time. In brewing, the right amount of yeast must be added, so it is necessary to know the number of cells per unit of volume.

During World War II, the Poisson distribution was used to find out whether the Germans were actually aiming at London from Calais, or just firing at random. This was important for the Allies to determine how good the technology available to the Nazis was.

Practical applications

The applications of the Poisson distribution refer to counts in time or counts in space. Since the probability of occurrence is small, it is also known as the "law of rare events".

Here is a list of events that fall into one of these categories:

-Count of the particles emitted in a radioactive decay, a process that, like the growth of yeast cells, follows an exponential function.

-Number of visits to a certain website.

-Arrival of people in a line to pay or be served (queueing theory).

-Number of cars that pass a certain point on a road, during a given time interval.

Figure 2. The number of cars passing through a point roughly follows a Poisson distribution. Source: Pixabay.

-Mutations in a certain DNA strand after receiving radiation exposure.

-Number of meteorites with a diameter greater than 1 m that fall in a year.

-Defects per square meter of fabric.

-Number of blood cells in 1 cubic centimeter.

-Calls per minute to a telephone exchange.

-Chocolate chips present in 1 kg of cake batter.

-Number of trees infected by a certain parasite in 1 hectare of forest.

Note that these random variables represent the number of times an event occurs during a fixed period of time (calls per minute to the telephone exchange), or within a given region of space (defects of a fabric per square meter).

These events, as has already been established, are independent of the time that has passed since the last occurrence.

Approximating the binomial distribution with the Poisson distribution

The Poisson distribution is a good approximation to the binomial distribution as long as:

-The sample size is large: n ≥ 100

-The probability p is small: p ≤ 0.1

-The mean μ = np is of the order of: np ≤ 10

In such cases the Poisson distribution is an excellent tool, since the binomial distribution can be difficult to apply.
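To get a feel for these conditions, the sketch below measures the largest gap between the binomial and Poisson probabilities for one case that meets the conditions and one that does not. The two parameter pairs are assumed examples, and the Poisson probability is computed through logarithms (`math.lgamma`) only to avoid overflow for large k.

```python
import math


def binomial_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Poisson probability, computed in log space for numerical safety."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def max_abs_error(n, p):
    """Largest gap between the two distributions over k = 0..n."""
    mu = n * p
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(n + 1))


good = max_abs_error(100, 0.05)  # n >= 100, p <= 0.1, np = 5
bad = max_abs_error(10, 0.5)     # small n and large p, same np = 5
print(good, bad)
```

Both cases share the same mean, so the difference in error comes only from how well the conditions on n and p are satisfied.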

Solved exercises

Exercise 1

A seismological study determined that during the last 100 years, there were 93 large earthquakes around the world, of at least 6.0 on the (logarithmic) Richter scale. Suppose that the Poisson distribution is a suitable model in this case. Find:

a) The average occurrence of large earthquakes per year.

b) If P(y) is the probability that y earthquakes occur during a randomly selected year, find the following probabilities:

P(0), P(1), P(2), P(3), P(4), P(5), P(6) and P(7).

c) The actual results of the study are the following:

- 47 years (0 earthquakes)

- 31 years (1 earthquake)

- 13 years (2 earthquakes)

- 5 years (3 earthquakes)

- 2 years (4 earthquakes)

- 0 years (5 earthquakes)

- 1 year (6 earthquakes)

- 1 year (7 earthquakes)

How do these results compare with those obtained in part b? Is the Poisson distribution a good choice to model these events?

Solution a)

a) Earthquakes are events whose probability p is small, and we are considering a restricted period of time of one year. The average number of earthquakes is:

μ = 93/100 earthquakes / year = 0.93 earthquakes per year.

Solution b)

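b) To calculate the requested probabilities, the values are substituted into the formula given at the beginning. For example, for y = 2, with μ = 0.93 and e = 2.71828:

$$P(2) = \frac{0.93^{2} \, e^{-0.93}}{2!} = 0.171$$

The probabilities for larger values of y are considerably smaller than P(2).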

The results are listed below:

P (0) = 0.395, P (1) = 0.367, P (2) = 0.171, P (3) = 0.0529, P (4) = 0.0123, P (5) = 0.00229, P (6) = 0.000355, P (7) = 0.0000471.
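The listed values can be reproduced with a short script. This is a sketch rather than the method used in the original study; the function name `poisson_pmf` is an illustrative choice.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


mu = 93 / 100  # 93 large earthquakes in 100 years
probs = [poisson_pmf(y, mu) for y in range(8)]

for y, p in enumerate(probs):
    print(f"P({y}) = {p:.3g}")
```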

For example, we could say that there is a 39.5% probability that no large earthquake will occur in a given year, or a 5.29% probability that 3 large earthquakes will occur in that year.

Solution c)

c) The expected frequencies are obtained by multiplying the probabilities by n = 100 years:

39.5; 36.7; 17.1; 5.29; 1.23; 0.229; 0.0355 and 0.00471.

For example:

- A frequency of 39.5 indicates that 0 large earthquakes are expected in 39.5 out of 100 years. This is quite close to the actual result of 47 years without any large earthquake.

Let's compare another Poisson result with the actual results:

- The obtained value of 36.7 means that exactly 1 large earthquake is expected in about 37 of the 100 years. The actual result is that there was 1 large earthquake in 31 years, a good match with the model.

- 17.1 years are expected with 2 large earthquakes, and it is known that there were in fact 2 large earthquakes in 13 years, which is a close value.

Therefore the Poisson model is acceptable for this case.
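The comparison between observed and expected frequencies can be laid out side by side. The sketch below takes the observed counts from the exercise, recovers the mean from them, and prints the Poisson expectation for each value; the variable names are illustrative.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


# Observed number of years with 0, 1, ..., 7 large earthquakes (from the exercise).
observed_years = [47, 31, 13, 5, 2, 0, 1, 1]

total_years = sum(observed_years)
total_quakes = sum(y * count for y, count in enumerate(observed_years))
mu = total_quakes / total_years  # mean earthquakes per year

expected_years = [total_years * poisson_pmf(y, mu) for y in range(len(observed_years))]

for y, (obs, expd) in enumerate(zip(observed_years, expected_years)):
    print(f"{y} earthquakes: observed {obs:2d} years, expected {expd:6.2f} years")
```

The observed table itself yields 93 earthquakes in 100 years, so the mean used in the model agrees with the data it is compared against.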

Exercise 2

A company estimates that the number of components that fail before reaching 100 operating hours follows a Poisson distribution. If the average number of failures is 8 in that time, find the following probabilities:

a) Exactly one component fails in 25 hours.

b) Fewer than two components fail in 50 hours.

c) At least three components fail in 125 hours.

Solution a)

a) It is known that the average number of failures in 100 hours is 8, therefore in 25 hours a quarter of that number is expected, that is, 2 failures. This will be the parameter μ.

The probability that 1 component fails is requested. The random variable is "components that fail before 25 hours" and its value is y = 1. Substituting in the probability function:

$$P(1) = \frac{2^{1} \, e^{-2}}{1!} = 0.271$$

Solution b)

b) In 50 hours half of the 8 failures are expected, so the parameter is μ = 4. The question asks for the probability that fewer than two components fail in 50 hours, not that exactly 2 components fail, therefore we must add the probabilities that:

-None fail

-Only 1 fails

P(fewer than 2 components fail) = P(0) + P(1)

P(fewer than 2 components fail) = 0.0183 + 0.0733 = 0.0916
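The first two parts can be checked with a short script that scales the mean to the length of the interval. This is a sketch under the exercise's assumption of 8 failures per 100 hours; the helper names are illustrative.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


RATE_PER_HOUR = 8 / 100  # 8 failures per 100 operating hours


def mean_failures(hours):
    """Expected number of failures in the given number of hours."""
    return RATE_PER_HOUR * hours


# a) exactly one failure in 25 hours (mean of 2)
p_a = poisson_pmf(1, mean_failures(25))

# b) fewer than two failures in 50 hours (mean of 4)
p_b = poisson_pmf(0, mean_failures(50)) + poisson_pmf(1, mean_failures(50))

print(round(p_a, 4), round(p_b, 4))
```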

Solution c)

c) That at least 3 components fail in 125 hours means that 3, 4, 5 or more can fail in that time.

The probability that at least one of several events occurs is equal to 1 minus the probability that none of them occurs.

-The desired event is that 3 or more components fail in 125 hours.

-That the event does not occur means that fewer than 3 components fail, whose probability is: P(0) + P(1) + P(2)

The parameter μ of the distribution in this case is:

 μ = 8 + 2 = 10 failures in 125 hours.

P(3 or more components fail) = 1 - P(0) - P(1) - P(2) = 1 - 0.0000454 - 0.000454 - 0.00227 = 0.9972
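The complement rule used in part c) can be sketched as a small reusable function. The name `prob_at_least` is a hypothetical helper introduced here for illustration.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def prob_at_least(k_min, mu):
    """P(Y >= k_min), computed as 1 minus the probability of fewer than k_min events."""
    return 1.0 - sum(poisson_pmf(k, mu) for k in range(k_min))


mu_125_hours = 8 / 100 * 125  # 10 failures expected in 125 hours
p_c = prob_at_least(3, mu_125_hours)
print(round(p_c, 4))
```

Working through the complement keeps the sum finite: only three terms are needed instead of an infinite tail.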

References

  1. MathWorks. Poisson distribution. Retrieved from: es.mathworks.com
  2. Mendenhall, W. 1981. Statistics for Management and Economics. 3rd ed. Grupo Editorial Iberoamérica.
  3. Stat Trek. Teach yourself Statistics. Poisson Distribution. Retrieved from: stattrek.com
  4. Triola, M. 2012. Elementary Statistics. 11th ed. Pearson Education.
  5. Wikipedia. Poisson distribution. Retrieved from: en.wikipedia.org
