It’s been a while since I took my probability and statistics course. So why not review the material and be ready for some more difficult courses?

Most of the definitions are copy-pasted from Wiki or other reference resources. I have only added some properties and comments.

Probability space

In probability theory, a probability space or a probability triple $(\Omega ,{\mathcal {F}},P)$ is a mathematical construct that provides a formal model of a random process or “experiment”.

A probability space consists of three elements:

  • A sample space, $\Omega$, which is the set of all possible outcomes.
  • An event space, which is a set of events ${\mathcal {F}}$, an event being a set of outcomes in the sample space.
  • A probability function $P$, which assigns each event in the event space a probability, a number between 0 and 1. The whole sample space has probability one, $$P(\Omega)=1$$ and for any event $A$ with complement $A^C$, $$P(A^C)=1-P(A)$$
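To make the three elements concrete, here is a minimal sketch of a finite probability space in Python. The fair six-sided die, the choice of the power set as the event space, and the names `omega`, `power_set`, and `prob` are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import chain, combinations

# Sample space: outcomes of one roll of a fair six-sided die (assumed example).
omega = frozenset(range(1, 7))

def power_set(s):
    """All subsets of s; for a finite sample space this is a valid event space."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

# Event space: every subset of omega is an event.
events = power_set(omega)

def prob(event):
    """Uniform probability function: each outcome has probability 1/6."""
    return Fraction(len(event), len(omega))

even = frozenset({2, 4, 6})
complement = omega - even

print(len(events))                          # 64
print(prob(omega))                          # 1
print(prob(even))                           # 1/2
print(prob(complement) == 1 - prob(even))   # True
```

Using `Fraction` keeps the probabilities exact, so the two properties can be checked with equality rather than a floating-point tolerance.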

Random Variable

A random variable is a function $X:\Omega\rightarrow S$ satisfying certain technical conditions.

It is a mapping or a function from possible outcomes in a sample space to a measurable space, often the real numbers.

For example, toss a coin. The sample space is $\{\text{Heads},\text{Tails}\}$, and $X$ can be 1 for heads and 0 for tails.
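The coin example can be written out as a small sketch, where the random variable is literally a Python function on the sample space. The fair-coin probabilities and the helper name `prob_X_equals` are assumptions for illustration.

```python
from fractions import Fraction

# Sample space and probability of each outcome (a fair coin is assumed).
omega = ["Heads", "Tails"]
P = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}

def X(outcome):
    """Random variable: maps an outcome to a real number."""
    return 1 if outcome == "Heads" else 0

def prob_X_equals(value):
    """P(X = value): total probability of the outcomes that X maps to value."""
    return sum(P[w] for w in omega if X(w) == value)

print(prob_X_equals(1))  # 1/2
print(prob_X_equals(0))  # 1/2
```

The point of the sketch is that probabilities about $X$ are computed by going back to the outcomes in $\Omega$ that produce the value in question.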

Representation of Distributions

In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment.

For instance, if X is used to denote the outcome of a coin toss (“the experiment”), then the probability distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.5 for X = tails (assuming that the coin is fair).

Let’s look at how we describe a distribution. Consider the probability $P(X\in E)$, where $E$ is a nice (measurable) set.

For continuous random variables, we have the probability density function (PDF) $\rho_X$. $$P(X\in E) = \int_E \rho_X(x)\, dx$$

For discrete random variables, we have the probability mass function (PMF) $p_X$. $$P(X\in E)=\sum_{x \in E} p_X(x)$$

For both types of RVs, we have CDF, the cumulative distribution function $F_X$. It gives $F_X(x)=P(X\leq x).$
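As a rough sketch of these three descriptions, the code below uses a fair die for the discrete case and an exponential distribution with rate 1 for the continuous case. Both distributions, the function names, and the midpoint-rule integrator are assumptions chosen for illustration.

```python
import math
from fractions import Fraction

# --- Discrete case: fair six-sided die (assumed example) ---
def pmf(x):
    """Probability mass function: P(X = x)."""
    return Fraction(1, 6) if x in range(1, 7) else Fraction(0)

def prob_discrete(E):
    """P(X in E) as a sum of the PMF over the set E."""
    return sum(pmf(x) for x in E)

def cdf_discrete(x):
    """F_X(x) = P(X <= x)."""
    return sum(pmf(k) for k in range(1, 7) if k <= x)

# --- Continuous case: exponential distribution with rate 1 (assumed example) ---
def pdf(x):
    """Probability density function."""
    return math.exp(-x) if x >= 0 else 0.0

def cdf(x):
    """F_X(x) = P(X <= x), known in closed form for this distribution."""
    return 1 - math.exp(-x) if x >= 0 else 0.0

def prob_interval(a, b, n=10_000):
    """P(a <= X <= b), integrating the PDF with the midpoint rule."""
    h = (b - a) / n
    return h * sum(pdf(a + (i + 0.5) * h) for i in range(n))

print(prob_discrete({2, 4, 6}))   # 1/2
print(cdf_discrete(4))            # 2/3
print(prob_interval(0.5, 2.0))    # numerical integral of the PDF
print(cdf(2.0) - cdf(0.5))        # same probability from the CDF
```

The last two printed values should agree closely, since for a continuous random variable $P(a \le X \le b) = F_X(b) - F_X(a)$.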

Joint Distribution

Computing the expectation of functions of a random vector.
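For a discrete random vector $(X, Y)$ with joint PMF $p_{X,Y}$, the expectation of a function $g$ is $E[g(X,Y)] = \sum_{x,y} g(x,y)\, p_{X,Y}(x,y)$. A minimal sketch, assuming two independent fair dice as the random vector (the names `joint_pmf` and `expectation` are hypothetical):

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two independent fair dice: each pair (x, y) has probability 1/36.
joint_pmf = {
    (x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)
}

def expectation(g):
    """E[g(X, Y)]: sum of g(x, y) weighted by the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())

print(expectation(lambda x, y: x + y))  # 7
print(expectation(lambda x, y: x * y))  # 49/4
```

Here $E[XY] = 49/4 = E[X]\,E[Y]$, which is what independence of the two dice predicts.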

LLN, Law of Large Numbers

In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and tends to become closer to the expected value as more trials are performed.
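A quick simulation can illustrate the statement. The sketch below assumes rolls of a fair six-sided die, whose expected value is 3.5, and prints the sample mean for increasing numbers of trials. The function name `sample_mean` and the seed are arbitrary choices.

```python
import random

def sample_mean(n, rng):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (10, 100, 1_000, 10_000, 100_000):
    # The printed means should drift toward the expected value 3.5.
    print(n, sample_mean(n, rng))
```

For small `n` the mean can be far from 3.5, while for large `n` it typically settles very close to it, which is the behaviour the law describes.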


CLT, Central Limit Theorem

In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are summed up, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed.

If ${\textstyle X_{1},X_{2},\dots ,X_{n},\dots }$ are random samples drawn from a population with overall mean ${\textstyle \mu }$ and finite variance ${\textstyle \sigma ^{2}}$, and if ${\textstyle {\bar {X}}_{n}}$ is the sample mean of the first ${\textstyle n}$ samples, then the limiting form of the distribution, ${\textstyle Z=\lim _{n\to \infty }{\left({\frac {{\bar {X}}_{n}-\mu }{\sigma _{\bar {X}}}}\right)}}$, with ${\displaystyle \sigma _{\bar {X}}=\sigma /{\sqrt {n}}}$, is a standard normal distribution.
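The standardization in this statement can be checked empirically. The sketch below assumes samples from a Uniform(0, 1) population, which is clearly not normal and has $\mu = 1/2$ and $\sigma^2 = 1/12$. The sample size `n=30`, the number of trials, and the function name `standardized_means` are arbitrary choices.

```python
import math
import random
import statistics

def standardized_means(n, trials, rng):
    """Return (sample mean - mu) / (sigma / sqrt(n)) for many samples of size n."""
    mu = 0.5                      # mean of Uniform(0, 1)
    sigma = math.sqrt(1 / 12)     # standard deviation of Uniform(0, 1)
    sigma_mean = sigma / math.sqrt(n)
    zs = []
    for _ in range(trials):
        xbar = sum(rng.random() for _ in range(n)) / n
        zs.append((xbar - mu) / sigma_mean)
    return zs

zs = standardized_means(n=30, trials=5_000, rng=random.Random(0))

print(statistics.mean(zs))                        # should be near 0
print(statistics.stdev(zs))                       # should be near 1
print(sum(abs(z) <= 1.96 for z in zs) / len(zs))  # should be near 0.95
```

If the standardized means are approximately standard normal, about 95% of them should fall within $\pm 1.96$, which is what the last line estimates.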