Statistical Distributions: Describing Data and Random Processes

Table of Contents

  1. Introduction
  2. What Is a Statistical Distribution?
  3. Types of Distributions: Discrete vs Continuous
  4. Probability Mass Functions (PMFs)
  5. Probability Density Functions (PDFs)
  6. Cumulative Distribution Function (CDF)
  7. Bernoulli and Binomial Distributions
  8. Poisson Distribution
  9. Geometric and Negative Binomial Distributions
  10. Uniform Distribution (Discrete and Continuous)
  11. Normal (Gaussian) Distribution
  12. Exponential and Gamma Distributions
  13. Beta Distribution
  14. Chi-Square and t-Distributions
  15. Multivariate Distributions
  16. Applications in Science and Engineering
  17. Conclusion

1. Introduction

Statistical distributions are mathematical functions that describe how likely the different outcomes of a random experiment are. They are foundational in probability theory, statistical inference, machine learning, and the physical sciences.


2. What Is a Statistical Distribution?

A distribution assigns probabilities or densities to all possible outcomes of a random variable. It allows us to:

  • Quantify uncertainty
  • Model data behavior
  • Infer population characteristics

3. Types of Distributions: Discrete vs Continuous

  • Discrete distributions: a finite or countably infinite set of outcomes (e.g., dice rolls, event counts)
  • Continuous distributions: an uncountable range of outcomes (e.g., time, height)

Each has its own mathematical description and properties.


4. Probability Mass Functions (PMFs)

Used for discrete random variables:

\[
P(X = x_i) = p_i, \quad p_i \ge 0, \quad \sum_i p_i = 1
\]

Examples: Bernoulli, Binomial, Poisson
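
The two PMF conditions can be checked mechanically. The following is a minimal sketch, assuming a fair six-sided die as the random variable; the names die_pmf, is_valid_pmf, and prob_of_event are illustrative and not part of any library.

```python
from fractions import Fraction

# Hypothetical example: PMF of a fair six-sided die, stored as a dict.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_pmf(pmf):
    """Check that all probabilities are non-negative and sum to 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

def prob_of_event(pmf, event):
    """P(X in event): sum the mass of the outcomes in the event."""
    return sum(p for x, p in pmf.items() if x in event)

print(is_valid_pmf(die_pmf))              # True
print(prob_of_event(die_pmf, {2, 4, 6}))  # 1/2
```

Exact fractions are used here so that the sum-to-one check is not affected by floating-point rounding.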


5. Probability Density Functions (PDFs)

Used for continuous random variables:

\[
P(a \le X \le b) = \int_a^b f(x)\, dx, \quad f(x) \ge 0, \quad \int_{-\infty}^\infty f(x)\, dx = 1
\]

Examples: Normal, Exponential, Gamma
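
With a density, probabilities come from areas, not from the value of f(x) itself. The sketch below approximates both integrals numerically for an exponential density with an assumed rate of 2; the midpoint rule and the helper names are illustrative choices, not something prescribed by the text.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def exp_pdf(x, lam=2.0):
    """Exponential density with rate lam (zero for x < 0)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

# P(0.5 <= X <= 1.0); the closed form is exp(-1) - exp(-2).
approx = integrate(exp_pdf, 0.5, 1.0)
exact = math.exp(-1.0) - math.exp(-2.0)
print(approx, exact)

# Total mass: the tail beyond x = 50 is negligible for lam = 2.
total = integrate(exp_pdf, 0.0, 50.0)
print(total)
```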


6. Cumulative Distribution Function (CDF)

The CDF gives the probability that a variable takes a value less than or equal to \( x \):

\[
F(x) = P(X \le x)
\]

For continuous variables:
\[
F(x) = \int_{-\infty}^x f(t) dt
\]
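
For a discrete variable the integral becomes a sum of probability mass. As a small illustration, assuming a fair six-sided die (the names pmf and cdf are hypothetical), the CDF and an interval probability can be computed as follows.

```python
from fractions import Fraction

# Hypothetical PMF of a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): sum the mass of all outcomes not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

print(cdf(0))           # 0
print(cdf(3))           # 1/2
print(cdf(6))           # 1
print(cdf(5) - cdf(2))  # P(2 < X <= 5) = 1/2
```

The last line uses the general identity P(a < X <= b) = F(b) - F(a), which holds for discrete and continuous variables alike.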


7. Bernoulli and Binomial Distributions

  • Bernoulli: \( X \in \{0, 1\} \), with success probability \( P(X = 1) = p \)
  • Binomial: number of successes in \( n \) independent Bernoulli trials, each with success probability \( p \)

PMF of binomial:

\[
P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}
\]
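
The binomial PMF translates directly into code. This is a minimal sketch using only the standard library; the values n = 10 and p = 0.3 are assumptions chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(sum(probs))  # close to 1, up to floating-point rounding
mean = sum(k * q for k, q in enumerate(probs))
print(mean)        # close to n * p = 3
```

With n = 1 the same function gives the Bernoulli case: binomial_pmf(1, 1, p) equals p.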


8. Poisson Distribution

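Models the number of events occurring in a fixed interval of time or space, when events occur independently at a constant average rate:

\[
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \dots
\]

Where \( \lambda \) is the expected number of events in the interval.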

Used in:

  • Queueing
  • Radioactive decay
  • Web traffic modeling
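
As a quick numerical check of the Poisson PMF, the sketch below evaluates it for an assumed rate of 4 events per interval and confirms that the probabilities sum to about 1 and that the mean is about the rate. The truncation at k = 60 is an illustrative choice; the remaining tail is negligible for this rate.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 4.0  # illustrative average number of events per interval
probs = [poisson_pmf(k, lam) for k in range(60)]

total = sum(probs)
mean = sum(k * q for k, q in enumerate(probs))
print(total)  # close to 1
print(mean)   # close to lam
```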

9. Geometric and Negative Binomial Distributions

  • Geometric: number of trials up to and including the first success
  • Negative Binomial: number of trials up to and including the \( r \)-th success

Geometric PMF:

\[
P(X = k) = (1 - p)^{k-1} p
\]
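
One way to see where the geometric PMF comes from is to simulate repeated Bernoulli trials and count how many are needed for the first success. The following sketch does this with an assumed success probability of 0.25 and a fixed random seed; the function names are illustrative.

```python
import random

def geometric_pmf(k, p):
    """P(X = k): k - 1 failures followed by one success, k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

def trials_until_success(p, rng):
    """Run Bernoulli(p) trials and return the index of the first success."""
    k = 1
    while rng.random() >= p:  # a draw below p counts as a success
        k += 1
    return k

p = 0.25                # illustrative success probability
rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [trials_until_success(p, rng) for _ in range(50_000)]

# Compare empirical frequencies with the PMF for the first few k.
for k in (1, 2, 3):
    empirical = samples.count(k) / len(samples)
    print(k, round(empirical, 3), round(geometric_pmf(k, p), 3))

sample_mean = sum(samples) / len(samples)
print(sample_mean)  # to compare with the theoretical mean 1/p = 4
```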


10. Uniform Distribution (Discrete and Continuous)

  • Discrete: equal probability for each outcome
  • Continuous: PDF is constant over interval:

\[
f(x) = \frac{1}{b - a}, \quad a \le x \le b
\]
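
Because the continuous uniform density is constant, interval probabilities reduce to a ratio of lengths. A minimal sketch, assuming endpoints a = 2 and b = 10 (the helper names are hypothetical):

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_interval_prob(c, d, a, b):
    """P(c <= X <= d): overlap of [c, d] with [a, b], divided by (b - a)."""
    overlap = max(0.0, min(d, b) - max(c, a))
    return overlap / (b - a)

a, b = 2.0, 10.0  # illustrative endpoints
print(uniform_pdf(5.0, a, b))                  # 0.125
print(uniform_interval_prob(4.0, 6.0, a, b))   # 0.25
print(uniform_interval_prob(8.0, 12.0, a, b))  # 0.25, only [8, 10] counts
```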


11. Normal (Gaussian) Distribution

Central in probability theory:

\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
\]

Characterized by:

  • \( \mu \): mean
  • \( \sigma^2 \): variance

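It arises naturally through the central limit theorem: the standardized sum of many independent, identically distributed random variables with finite variance is approximately normal.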
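
The central limit theorem can be illustrated by simulation. The sketch below averages 30 Uniform(0, 1) draws, standardizes the result, and checks how often it falls within one standard deviation of zero; for a normal variable that fraction is about 0.6827. The sample sizes and the seed are assumptions made for this example.

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and variance sigma^2."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def standardized_mean(n, rng):
    """Standardized mean of n Uniform(0, 1) draws (mean 1/2, variance 1/12)."""
    sample_mean = sum(rng.random() for _ in range(n)) / n
    return (sample_mean - 0.5) / math.sqrt(1.0 / (12.0 * n))

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [standardized_mean(30, rng) for _ in range(20_000)]
within_one_sd = sum(abs(z) <= 1.0 for z in draws) / len(draws)

print(normal_pdf(0.0))                # peak of the standard normal density
print(within_one_sd)                  # empirical fraction within one standard deviation
print(math.erf(1.0 / math.sqrt(2)))  # normal reference value, about 0.6827
```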


12. Exponential and Gamma Distributions

  • Exponential: waiting time between events that occur at a constant rate \( \lambda \):

\[
f(x) = \lambda e^{-\lambda x}, \quad x \ge 0
\]

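  • Gamma: generalization with shape \( k \) and rate \( \lambda \); for integer \( k \) it is the waiting time until the \( k \)-th event:

\[
f(x) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}, \quad x > 0
\]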
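
Setting the shape to 1 in the gamma density gives back the exponential density, since \( \Gamma(1) = 1 \). A minimal sketch of that check, using the shape and rate parameterization above and illustrative function names:

```python
import math

def exponential_pdf(x, lam):
    """Exponential density with rate lam, for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def gamma_pdf(x, k, lam):
    """Gamma density with shape k and rate lam, for x > 0."""
    if x <= 0:
        return 0.0  # sketch: the boundary point x = 0 is not treated specially
    return lam**k * x ** (k - 1) * math.exp(-lam * x) / math.gamma(k)

lam = 2.0  # illustrative rate
for x in (0.1, 0.5, 1.5):
    # With k = 1 the two densities agree up to rounding.
    print(x, gamma_pdf(x, 1, lam), exponential_pdf(x, lam))
```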


13. Beta Distribution

Used for modeling probabilities (bounded on [0,1]):

\[
f(x) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}
\]

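Where \( B \) is the beta function. The shape is very flexible: depending on \( \alpha \) and \( \beta \), the density can be uniform, U-shaped, unimodal, or skewed.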
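
The beta function can be written in terms of the gamma function as \( B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta) / \Gamma(\alpha + \beta) \), which makes the density easy to evaluate with the standard library. The sketch below uses that identity; the parameter values are chosen only for illustration.

```python
import math

def beta_function(alpha, beta):
    """B(alpha, beta) via the gamma-function identity."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)

def beta_pdf(x, alpha, beta):
    """Beta density on the open interval (0, 1); zero elsewhere in this sketch."""
    if not 0.0 < x < 1.0:
        return 0.0
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)

print(beta_pdf(0.5, 1, 1))      # Beta(1, 1) is the uniform density, so this is 1
print(beta_pdf(0.5, 2, 2))      # symmetric, unimodal case; close to 1.5
print(beta_pdf(0.1, 0.5, 0.5))  # U-shaped case: large density near the edges
```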


14. Chi-Square and t-Distributions

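  • Chi-square: sum of \( k \) squared independent standard normal variables, with \( k \) degrees of freedom
    Used in hypothesis testing and confidence intervals
  • t-distribution: heavier-tailed than the normal; accounts for the extra uncertainty of estimating the variance from a small sample, and converges to the standard normal as the degrees of freedom \( \nu \to \infty \)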
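
The definition of the chi-square distribution as a sum of squared standard normals can be checked by simulation. This sketch draws such sums for an assumed 5 degrees of freedom and compares the sample mean and variance with the theoretical values k and 2k; the seed and sample size are illustrative.

```python
import random

def chi_square_draw(k, rng):
    """One draw: sum of k squared independent standard normal variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

k = 5                   # illustrative degrees of freedom
rng = random.Random(2)  # fixed seed so the run is repeatable
draws = [chi_square_draw(k, rng) for _ in range(20_000)]

mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(mean)      # theory: k
print(variance)  # theory: 2k
```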

15. Multivariate Distributions

Multivariate distributions describe several random variables jointly:

  • Joint distribution: \( f(x, y) \)
  • Multivariate normal: generalization of Gaussian to vector-valued random variables
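
As a small illustration of a multivariate normal, the sketch below samples a standard bivariate normal with an assumed correlation of 0.8 by mixing two independent standard normals, then estimates the correlation from the sample. The construction y = rho * z1 + sqrt(1 - rho^2) * z2 is one common way to do this; the function names are hypothetical.

```python
import math
import random

def bivariate_normal_draw(rho, rng):
    """One (x, y) pair with standard normal margins and correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2

def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

rho = 0.8               # illustrative correlation
rng = random.Random(3)  # fixed seed so the run is repeatable
pairs = [bivariate_normal_draw(rho, rng) for _ in range(20_000)]
estimated_rho = sample_correlation(pairs)
print(estimated_rho)  # to compare with rho
```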

Applications in:

  • Machine learning
  • Statistical physics
  • Bayesian networks

16. Applications in Science and Engineering

  • Physics: Maxwell-Boltzmann, Fermi-Dirac, Bose-Einstein
  • Signal processing: noise modeling
  • Biostatistics: survival analysis
  • Finance: stock returns and risk models
  • Machine learning: generative models, Naive Bayes

17. Conclusion

Statistical distributions are the mathematical backbone of modeling uncertainty. Mastering their properties and applications allows scientists and engineers to analyze data, simulate systems, and infer hidden patterns in the natural world.

