
Introduction To The Poisson Distribution

Prologue To The Poisson Distribution

In probability theory and statistics, the Poisson distribution models the number of random arrivals within a given time period. The interarrival time of the Poisson process follows the exponential distribution, which can also be expressed as a special case of the gamma distribution.
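To make the prologue concrete, below is a minimal Python sketch (the function name, $\lambda$, $t$, and trial count are my own illustrative choices): it simulates a Poisson process by drawing exponential interarrival times with rate $\lambda$ and counts the arrivals falling in $[0,t]$; the empirical mean of the count should approach $\lambda\cdot t$.

```python
import random

# A minimal sketch (count_arrivals, lam, t, trials are illustrative):
# simulate one Poisson process path by drawing exponential interarrival
# times with rate lam, and count how many arrivals fall inside [0, t].
def count_arrivals(lam, t):
    count, clock = 0, random.expovariate(lam)
    while clock <= t:
        count += 1
        clock += random.expovariate(lam)
    return count

lam, t, trials = 2.0, 3.0, 100_000
counts = [count_arrivals(lam, t) for _ in range(trials)]
print(sum(counts) / trials)  # empirical mean, close to lam * t = 6.0
```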

The Poisson Process Illustration

The Poisson process is a simple kind of random process, describing the random distribution of points in time or space. It is developed based on the two assumptions below:
[1]homogeneity: it assumes the rate $\lambda$ of event occurrence is constant over time. The expected number of random point arrivals over a time period $t$ is $\lambda\cdot t$.
[2]independence: it assumes all random occurrences are independent. This says that the numbers of arrivals over disjoint time intervals are independent random variables.

Next comes the illustration of the process.
➀suppose within a time interval $[0,t]$, the arrival of points or the occurrence of events is random and is represented by the random variables $X_{1}$, $X_{2}$, $X_{3}$…, and this scenario complies with homogeneity and independence.
This article denotes the total number of occurrences within $[0,t]$ as $N([0,t])$, abbreviated as $N_{t}$ for time length $t$. Homogeneity implies that $E\lbrack N_{t}\rbrack$=$\lambda\cdot t$.
➁to more precisely approximate the distribution of such random arrivals, we divide the time period $t$ into $n$ subintervals, where $n$ is believed to be large enough.
Then each distinct subinterval has time length $\frac {t}{n}$, and each subinterval would contain either $1$ arrival (success) or $0$ arrivals (failure), which is itself a Bernoulli trial.
➂each subinterval has time length $\frac {t}{n}$; the $i$-th subinterval ranges from time $\frac {(i-1)\cdot t}{n}$ to $\frac {i\cdot t}{n}$. We take $R_{i}$ as the Bernoulli random variable of the $i$-th subinterval, whose outcome is $1$ for success and $0$ for failure. So the expected value of the $i$-th arrival is:
$E\lbrack R_{i}\rbrack$=$1\cdot P_{i}$+$0\cdot F_{i}$, where $F_{i}$=$1$-$P_{i}$ for each $i$, and $P_{i}$=$\frac {\lambda\cdot t}{n}$.
➃we then accumulate the outcomes of all the $R_{i}$; trivially, the total number of event occurrences remains the same.
$N_{t}$=$R_{1}$+$R_{2}$+…+$R_{i}$+…+$R_{n}$. The point is that each $R_{i}$ is an independent random variable, so the original random process behaves as a Binomial distribution as a whole; therefore $N_{t}$ has the Binomial distribution Bin($n$,$p$), where $p$=$\frac {\lambda\cdot t}{n}$.
➄$C_{k}^{n}(\frac {\lambda\cdot t}{n})^{k}\cdot(1-\frac {\lambda\cdot t}{n})^{n-k}$ is the probability of $k$ arrivals in this Binomial distribution, and the value of $n$ really matters. To get rid of this concern, we let $n$ tend to infinity.
$\lim_{n\rightarrow\infty}C_{k}^{n}(\frac {\lambda\cdot t}{n})^{k}\cdot(1-\frac {\lambda\cdot t}{n})^{n-k}$
=$\lim_{n\rightarrow\infty}C_{k}^{n}(\frac {1}{n})^{k}\cdot(\lambda\cdot t)^{k}\cdot(1-\frac {\lambda\cdot t}{n})^{n-k}$
➅$\lim_{n\rightarrow\infty}C_{k}^{n}(\frac {1}{n})^{k}$
=$\lim_{n\rightarrow\infty}\frac {n}{n}\cdot\frac {n-1}{n}\cdot\frac {n-2}{n}\cdot\cdot\cdot\frac {n-k+1}{n}\cdot\frac {1}{k!}$
=$\frac {1}{k!}$…since each of the $k$ factors $\frac {n-j}{n}$, $j$=$0$,…,$k-1$, tends to $1$
➆$\lim_{n\rightarrow\infty}(1-\frac {\lambda\cdot t}{n})^{n-k}$
=$\lim_{n\rightarrow\infty}(1-\frac {\lambda\cdot t}{n})^{n}\cdot(1-\frac {\lambda\cdot t}{n})^{-k}$
=$e^{-\lambda\cdot t}$
where by calculus, $\lim_{n\rightarrow\infty}(1+\frac {x}{n})^{n}$=$e^{x}$, which with $x$=$-\lambda\cdot t$ gives $\lim_{n\rightarrow\infty}(1-\frac {\lambda\cdot t}{n})^{n}$=$e^{-\lambda\cdot t}$,
and $\lim_{n\rightarrow\infty}(1-\frac {\lambda\cdot t}{n})^{-k}$=$1$.
➇thus the probability of $k$ random arrivals is derived below:

$P(N_{t}=k)$

=$\lim_{n\rightarrow\infty}C_{k}^{n}(\frac {\lambda\cdot t}{n})^{k}\cdot(1-\frac {\lambda\cdot t}{n})^{n-k}$
=$\lim_{n\rightarrow\infty}C_{k}^{n}(\frac {1}{n})^{k}\cdot(\lambda\cdot t)^{k}\cdot(1-\frac {\lambda\cdot t}{n})^{n-k}$
=$\frac {(\lambda\cdot t)^{k}}{k!}\cdot e^{-\lambda\cdot t}$
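The limit taken in steps ➄ through ➇ can be checked numerically. The sketch below (with illustrative values of $\lambda$, $t$, and $k$) evaluates the Bin($n$,$\frac {\lambda\cdot t}{n}$) probability of $k$ arrivals for growing $n$ and compares it against $\frac {(\lambda\cdot t)^{k}}{k!}\cdot e^{-\lambda\cdot t}$.

```python
from math import comb, exp, factorial

# Illustrative values: rate lam, period t, number of arrivals k.
lam, t, k = 2.0, 3.0, 4
mu = lam * t

def binom_pmf(n, k, p):
    # C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

for n in (10, 100, 1_000, 10_000):
    print(n, binom_pmf(n, k, mu / n))             # Bin(n, lam*t/n), k arrivals
print("limit:", mu**k / factorial(k) * exp(-mu))  # (lam*t)^k / k! * e^{-lam*t}
```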

The Poisson Distribution Definition

By illustration step ➇, we have $\frac {(\lambda\cdot t)^{k}}{k!}\cdot e^{-\lambda\cdot t}$ as the probability of $k$ random arrivals. Below we formally state the definition of the Poisson distribution; it is a valid distribution as a result of the fact that $e^{-\lambda\cdot t}\cdot\sum_{k=0}^{\infty}\frac {(\lambda\cdot t)^{k}}{k!}$=$1$.

[Definition]

For any discrete random variable $X$ with parameter $\mu$, it is said to have a Poisson distribution if its probability mass function is given by
$P(X=k)$=$\frac {(\mu)^{k}}{k!}\cdot e^{-\mu}$, for $k$=$0$,$1$,$2$,…, denoted as $Pois(\mu)$, where
➀$\mu$=$\lambda\cdot t$, and $\lambda$ is a constant event occurrence rate in units of (event counts)/(time unit).
➁$t$ is the length of the time period, expressed in the same time unit as the rate $\lambda$.

Please recall that we use the term probability mass function, since this random process is deduced from a discrete distribution.
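As a quick sanity check of the definition, the hypothetical helper below evaluates the $Pois(\mu)$ probability mass function and sums it over $k$ to confirm the total probability is $1$ (the truncation bound of $200$ terms is arbitrary).

```python
from math import exp, factorial

# Hypothetical helper implementing the pmf of Pois(mu) defined above.
def pois_pmf(k, mu):
    return mu**k / factorial(k) * exp(-mu)

mu = 6.0  # e.g. lam = 2.0 events per time unit over a period t = 3.0
total = sum(pois_pmf(k, mu) for k in range(200))  # 200 is an arbitrary cutoff
print(total)  # ~1.0: the pmf sums to one over k = 0, 1, 2, ...
```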

Expected Value And Variance Of The Poisson Distribution

Following the paragraphs above, we know that the probability for each distinct $R_{i}$ to have an event occurrence is $\frac {\lambda\cdot t}{n}$, which is a great shortcut to the expected value and variance.
[1]expected value
$E\lbrack N_{t}\rbrack$
=$E\lbrack\sum_{i=1}^{n}R_{i}\rbrack$
=$\sum_{i=1}^{n}E\lbrack R_{i}\rbrack$
=$n\cdot\frac {\lambda\cdot t}{n}$
=$\lambda\cdot t$…holds for $n\rightarrow\infty$
[2]variance
➀begin from the Binomial variance.
$Var\lbrack N_{t}\rbrack$
=$Var\lbrack\sum_{i=1}^{n}R_{i}\rbrack$
=$Var\lbrack R_{1}+R_{2}+…+R_{n}\rbrack$
=$Var\lbrack R_{1}\rbrack$+$Var\lbrack R_{2}\rbrack$+…+$Var\lbrack R_{n}\rbrack$…by the independence of the $R_{i}$
➁for each $i$,
$Var\lbrack R_{i}\rbrack$
=$E\lbrack R_{i}^{2}\rbrack$-$E^{2}\lbrack R_{i}\rbrack$
=$1^{2}\cdot p+0^{2}\cdot(1-p)$-$p^{2}$
=$p$-$p^{2}$, where $p$ is the success probability
=$p\cdot(1-p)$, take $p$=$\frac {\lambda\cdot t}{n}$
➂go to the Poisson case $n\rightarrow\infty$:
$\lim_{n\rightarrow\infty}Var\lbrack N_{t}\rbrack$
=$\lim_{n\rightarrow\infty}n\cdot p\cdot(1-p)$
=$\lim_{n\rightarrow\infty}n\cdot \frac {\lambda\cdot t}{n}\cdot(1-\frac {\lambda\cdot t}{n})$
=$\lambda\cdot t$, where $\lim_{n\rightarrow\infty}(1-\frac {\lambda\cdot t}{n})$=$1$

You can also see the Poisson variance proof on Wikipedia. We find that the Poisson distribution has the same expected value and variance, both equal to $\lambda\cdot t$.
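A Monte Carlo check of this equality is sketched below; it assumes Knuth's classic uniform-product method for sampling a Poisson random variable, which is not part of the derivation above.

```python
import math
import random
from statistics import mean, pvariance

# A sketch using Knuth's uniform-product method (an assumption here, not
# part of the derivation above) to draw samples from Pois(mu).
def pois_sample(mu):
    threshold, k, prod = math.exp(-mu), 0, random.random()
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k

mu, trials = 6.0, 100_000
samples = [pois_sample(mu) for _ in range(trials)]
print(mean(samples), pvariance(samples))  # both should be close to mu
```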

Example: Simple Poisson Probability Illustration

Given that $\lambda$ is the constant rate for the event occurrence over time, then the probability of one arrival within $[0,2s]$ would just be $P(N_{2s}=1)$=$\frac {\lambda\cdot 2s}{1!}\cdot e^{-\lambda\cdot 2s}$.
And one arrival within $[0,2s]$ is identical to the union of two disjoint cases: one arrival within $[0,s]$ and zero arrivals within $[s,2s]$, or zero arrivals within $[0,s]$ and one arrival within $[s,2s]$. Therefore,
$P(N_{2s}=1)$
=$P(N_{[0,s]}=1,N_{[s,2s]}=0)$+$P(N_{[0,s]}=0,N_{[s,2s]}=1)$
=$(\frac {\lambda\cdot s}{1!}\cdot e^{-\lambda\cdot s})\cdot(\frac {(\lambda\cdot s)^{0}}{0!}\cdot e^{-\lambda\cdot s})$+$(\frac {(\lambda\cdot s)^{0}}{0!}\cdot e^{-\lambda\cdot s})\cdot(\frac {\lambda\cdot s}{1!}\cdot e^{-\lambda\cdot s})$
=$\frac {\lambda\cdot 2s}{1!}\cdot e^{-\lambda\cdot 2s}$
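The decomposition can also be verified numerically; the sketch below uses illustrative values of $\lambda$ and $s$ and compares the direct probability with the two-case split.

```python
from math import exp, factorial

lam, s = 1.5, 2.0  # illustrative rate and half-interval length

def pois_pmf(k, mu):  # Poisson pmf with mu = lam * (interval length)
    return mu**k / factorial(k) * exp(-mu)

direct = pois_pmf(1, lam * 2 * s)  # one arrival in [0, 2s]
split = (pois_pmf(1, lam * s) * pois_pmf(0, lam * s)
         + pois_pmf(0, lam * s) * pois_pmf(1, lam * s))
print(direct, split)  # the two probabilities agree
```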

Addendum

➀A Modern Introduction to Probability and Statistics, published by Springer.