Table of Contents
- Introduction
- What Is a Statistical Distribution?
- Types of Distributions: Discrete vs Continuous
- Probability Mass Functions (PMFs)
- Probability Density Functions (PDFs)
- Cumulative Distribution Function (CDF)
- Bernoulli and Binomial Distributions
- Poisson Distribution
- Geometric and Negative Binomial Distributions
- Uniform Distribution (Discrete and Continuous)
- Normal (Gaussian) Distribution
- Exponential and Gamma Distributions
- Beta Distribution
- Chi-Square and t-Distributions
- Multivariate Distributions
- Applications in Science and Engineering
- Conclusion
1. Introduction
Statistical distributions are mathematical functions that describe how likely the different outcomes of a random experiment are. They are foundational in probability theory, statistical inference, machine learning, and the physical sciences.
2. What Is a Statistical Distribution?
A distribution assigns probabilities or densities to all possible outcomes of a random variable. It allows us to:
- Quantify uncertainty
- Model data behavior
- Infer population characteristics
3. Types of Distributions: Discrete vs Continuous
- Discrete distributions: finite or countably infinite sets of outcomes (e.g., dice rolls)
- Continuous distributions: uncountably infinite outcomes (e.g., time, height)
Each has its own mathematical description and properties.
4. Probability Mass Functions (PMFs)
Used for discrete random variables:
\[
P(X = x_i) = p_i, \quad \sum_i p_i = 1
\]
Examples: Bernoulli, Binomial, Poisson
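As a small illustration of the two PMF conditions above, the sketch below builds the PMF of a fair six-sided die and checks that it is valid. The helper name `is_valid_pmf` and the die example are assumptions chosen for illustration; exact fractions are used so the sum is exactly 1.

```python
from fractions import Fraction


def is_valid_pmf(pmf):
    """Return True if every probability is non-negative and they sum to exactly 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1


# Fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# The probability of an event is the sum of the PMF over its outcomes.
p_even = sum(p for face, p in die_pmf.items() if face % 2 == 0)

print(is_valid_pmf(die_pmf), p_even)
```

With floating-point probabilities, the equality check would need a tolerance instead.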
5. Probability Density Functions (PDFs)
Used for continuous random variables:
\[
P(a \le X \le b) = \int_a^b f(x) dx, \quad \int_{-\infty}^\infty f(x) dx = 1
\]
Examples: Normal, Exponential, Gamma
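For a continuous variable, probabilities come from integrating the density rather than summing point masses. The following sketch approximates both integrals above numerically with the midpoint rule. The density \( f(x) = 2x \) on \([0, 1]\) and the function names are hypothetical, chosen only because the exact answers are easy to verify by hand.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(x):
    """Hypothetical example density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


total = integrate(example_pdf, 0.0, 1.0)   # should be close to 1
prob = integrate(example_pdf, 0.5, 1.0)    # P(0.5 <= X <= 1) = 1 - 0.25

print(total, prob)
```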
6. Cumulative Distribution Function (CDF)
The CDF gives the probability that a variable takes a value less than or equal to \( x \):
\[
F(x) = P(X \le x)
\]
For continuous variables:
\[
F(x) = \int_{-\infty}^x f(t) dt
\]
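One practical consequence of the definition is that interval probabilities are differences of CDF values: \( P(a < X \le b) = F(b) - F(a) \). A minimal sketch, assuming a standard normal variable and using `NormalDist` from Python's standard `statistics` module (the helper name `interval_probability` is an illustrative choice):

```python
from statistics import NormalDist

# Standard normal distribution, used here only as a convenient example.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def interval_probability(dist, a, b):
    """P(a < X <= b) computed as F(b) - F(a)."""
    return dist.cdf(b) - dist.cdf(a)


# Probability of falling within one standard deviation of the mean.
p_within_one_sigma = interval_probability(std_normal, -1.0, 1.0)

print(std_normal.cdf(0.0), p_within_one_sigma)
```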
7. Bernoulli and Binomial Distributions
- Bernoulli: a single trial with \( X \in \{0, 1\} \) and success probability \( p \)
- Binomial: number of successes in \( n \) independent Bernoulli trials, each with success probability \( p \)
PMF of binomial:
\[
P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}
\]
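The binomial PMF translates almost directly into code. The sketch below is one straightforward way to evaluate it with `math.comb`; the function name `binomial_pmf` and the parameter values are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(5, 10, 0.5)

# The PMF over k = 0..n should sum to 1.
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))

print(p_five_heads, total)
```

Setting `n = 1` recovers the Bernoulli distribution: `binomial_pmf(1, 1, p)` is `p` and `binomial_pmf(0, 1, p)` is `1 - p`.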
8. Poisson Distribution
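Models the number of events occurring in a fixed interval of time or space, when events occur independently at a constant average rate: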
\[
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}
\]
where \( \lambda \) is the average number of events per interval; it is both the mean and the variance of the distribution.
Used in:
- Queueing
- Radioactive decay
- Web traffic modeling
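A minimal sketch of the Poisson PMF, assuming an illustrative rate of \( \lambda = 3 \) events per interval (the function name and the truncation at 60 terms are choices made for this example):

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)


lam = 3.0

# Truncated sums; the tail beyond k = 60 is negligible for lam = 3.
total = sum(poisson_pmf(k, lam) for k in range(60))
mean = sum(k * poisson_pmf(k, lam) for k in range(60))

print(poisson_pmf(0, lam), total, mean)
```

The computed mean matching \( \lambda \) is a quick sanity check on the formula.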
9. Geometric and Negative Binomial Distributions
- Geometric: number of trials up to and including the first success
- Negative Binomial: number of trials up to and including the \( r \)-th success
Geometric PMF:
\[
P(X = k) = (1 - p)^{k-1} p, \quad k = 1, 2, \dots
\]
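The geometric PMF can be checked numerically in a few lines. This sketch assumes the "number of trials" convention used above (support starting at \( k = 1 \)) and an illustrative success probability of 0.25; under that convention the mean is \( 1/p \).

```python
def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k = 1, 2, ...)."""
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p


p = 0.25

# Truncated sums; terms beyond k = 2000 are negligible for p = 0.25.
total = sum(geometric_pmf(k, p) for k in range(1, 2001))
mean = sum(k * geometric_pmf(k, p) for k in range(1, 2001))

print(total, mean)
```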
10. Uniform Distribution (Discrete and Continuous)
- Discrete: equal probability for each outcome
- Continuous: PDF is constant over interval:
\[
f(x) = \frac{1}{b - a}, \quad a \le x \le b
\]
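Because the continuous uniform density is constant, the probability of a sub-interval is just its length divided by \( b - a \). A short sketch of that idea, with hypothetical function names and an example interval \([2, 6]\):

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_interval_probability(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b]; [c, d] is clipped to [a, b]."""
    lo, hi = max(c, a), min(d, b)
    return max(hi - lo, 0.0) / (b - a)


print(uniform_pdf(3.0, 2.0, 6.0))                         # constant height 1 / (6 - 2)
print(uniform_interval_probability(3.0, 5.0, 2.0, 6.0))   # length 2 out of 4
```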
11. Normal (Gaussian) Distribution
Central in probability theory:
\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
\]
Characterized by:
- \( \mu \): mean
- \( \sigma^2 \): variance
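Arises naturally via the central limit theorem: sums and averages of many independent random variables with finite variance are approximately normally distributed.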
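The sketch below does two things: it evaluates the Gaussian density directly from the formula, and it illustrates the central limit theorem by averaging uniform random numbers. The sample size (30), the number of replicates (5000), and the seed are arbitrary assumptions for the demonstration, not recommended values.

```python
import math
import random
from statistics import fmean, pstdev


def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    var = sigma**2
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def sample_means(n, replicates, seed=0):
    """Means of `replicates` independent samples of n Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [fmean([rng.random() for _ in range(n)]) for _ in range(replicates)]


means = sample_means(n=30, replicates=5000)

# CLT prediction: mean 1/2 and standard deviation sqrt((1/12) / n).
center = fmean(means)
spread = pstdev(means)
predicted_spread = math.sqrt((1 / 12) / 30)

print(center, spread, predicted_spread)
```

Even though each individual draw is uniform, the averages cluster around 0.5 with a spread close to the predicted value.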
12. Exponential and Gamma Distributions
- Exponential: waiting time between events that occur at a constant rate \( \lambda \):
\[
f(x) = \lambda e^{-\lambda x}, \quad x \ge 0
\]
- Gamma: generalization of the exponential with shape \( k \) and rate \( \lambda \); \( k = 1 \) recovers the exponential:
\[
f(x) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}, \quad x > 0
\]
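To make the relationship between the two densities concrete, the following sketch implements both and checks that the gamma density with shape \( k = 1 \) matches the exponential density. Function names and the test points are illustrative assumptions.

```python
import math


def exponential_pdf(x, lam):
    """Exponential density with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def gamma_pdf(x, k, lam):
    """Gamma density with shape k and rate lam, evaluated for x > 0."""
    if x <= 0:
        return 0.0
    return lam**k * x ** (k - 1) * math.exp(-lam * x) / math.gamma(k)


# With k = 1 the gamma density reduces to the exponential density.
for x in (0.1, 1.0, 2.5):
    print(x, gamma_pdf(x, 1.0, 2.0), exponential_pdf(x, 2.0))
```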
13. Beta Distribution
Used for modeling probabilities and proportions (supported on [0, 1]):
\[
f(x) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}
\]
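where \( B \) is the beta function, which normalizes the density. The parameters \( \alpha, \beta > 0 \) control the shape, which can be uniform, unimodal, U-shaped, or skewed.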
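A minimal sketch of the beta density, computing the beta function through the standard identity \( B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta) / \Gamma(\alpha + \beta) \). The function names are illustrative, and the density is evaluated only on the open interval (0, 1) to avoid division by zero at the endpoints when a parameter is below 1.

```python
import math


def beta_function(a, b):
    """B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def beta_pdf(x, a, b):
    """Beta(a, b) density, evaluated on the open interval (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)


print(beta_pdf(0.3, 1.0, 1.0))   # Beta(1, 1) is the uniform density
print(beta_pdf(0.5, 2.0, 2.0))   # symmetric, peaked at 0.5
print(beta_pdf(0.1, 0.5, 0.5))   # U-shaped: larger near the endpoints
```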
14. Chi-Square and t-Distributions
- Chi-square: sum of squared independent standard normals; used in hypothesis testing and confidence intervals
- t-distribution: accounts for the extra uncertainty of small sample sizes; converges to the normal as \( n \to \infty \)
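The "sum of squared standard normals" description of the chi-square distribution can be checked by simulation. The sketch below draws such sums for 3 degrees of freedom and compares the sample mean and variance with the theoretical values \( k \) and \( 2k \). The degrees of freedom, sample count, and seed are arbitrary assumptions.

```python
import random
from statistics import fmean, pvariance


def chi_square_samples(df, size, seed=0):
    """Simulate chi-square draws as sums of df squared standard normals."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(size)
    ]


samples = chi_square_samples(df=3, size=20_000)

# Theory: mean = df = 3, variance = 2 * df = 6.
print(fmean(samples), pvariance(samples))
```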
15. Multivariate Distributions
Describe the joint behavior of multiple random variables:
- Joint distribution: \( f(x, y) \)
- Multivariate normal: generalization of Gaussian to vector-valued random variables
Applications in:
- Machine learning
- Statistical physics
- Bayesian networks
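As a sketch of the simplest multivariate case, the code below samples from a bivariate normal by combining two independent standard normals so that the result has a chosen correlation \( \rho \). The function names, parameter values, and seed are assumptions for illustration; a general \( d \)-dimensional version would typically use a Cholesky factor of the covariance matrix.

```python
import math
import random
from statistics import fmean


def sample_bivariate_normal(mu, sigma, rho, size, seed=0):
    """Draw (x, y) pairs with means mu, std devs sigma, and correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(size):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu[0] + sigma[0] * z1
        # Mixing z1 into y induces correlation rho between x and y.
        y = mu[1] + sigma[1] * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        pairs.append((x, y))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    cov = fmean([(x - mx) * (y - my) for x, y in pairs])
    var_x = fmean([(x - mx) ** 2 for x in xs])
    var_y = fmean([(y - my) ** 2 for y in ys])
    return cov / math.sqrt(var_x * var_y)


pairs = sample_bivariate_normal(mu=(1.0, -2.0), sigma=(2.0, 0.5), rho=0.8, size=20_000)
print(sample_correlation(pairs))
```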
16. Applications in Science and Engineering
- Physics: Maxwell-Boltzmann, Fermi-Dirac, Bose-Einstein
- Signal processing: noise modeling
- Biostatistics: survival analysis
- Finance: stock returns and risk models
- Machine learning: generative models, Naive Bayes
17. Conclusion
Statistical distributions are the mathematical backbone of modeling uncertainty. Mastering their properties and applications allows scientists and engineers to analyze data, simulate systems, and infer hidden patterns in the natural world.