It’s been a while since I took my probability & statistics course, so why not review it and get ready for some more difficult courses?

Most of the definitions are copy-pasted from Wikipedia or other reference resources; I only add some properties and comments.

Probability space

In probability theory, a probability space or a probability triple $(\Omega ,{\mathcal {F}},P)$ is a mathematical construct that provides a formal model of a random process or “experiment”.

A probability space consists of three elements:

  • A sample space, $\Omega$, which is the set of all possible outcomes.
  • An event space, which is a set of events ${\mathcal {F}}$, an event being a set of outcomes in the sample space.
  • A probability function $P$, which assigns each event in the event space a probability, a number between 0 and 1. Two basic properties: $$P(\Omega)=1$$ $$P(A^c)=1-P(A)$$
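As a sanity check, here is a minimal sketch of a finite probability space in Python (the two-coin-toss example and the names `sample_space` and `prob` are my own, not from any library):

```python
import itertools

# A toy finite probability space: two tosses of a fair coin.
sample_space = set(itertools.product(["H", "T"], repeat=2))  # Omega, 4 outcomes

def prob(event):
    """The probability function P: assigns each event (a subset of Omega)
    a number between 0 and 1 (uniform measure here)."""
    return len(event & sample_space) / len(sample_space)

A = {o for o in sample_space if o[0] == "H"}   # event: first toss is heads

assert prob(sample_space) == 1                 # P(Omega) = 1
assert prob(sample_space - A) == 1 - prob(A)   # P(A^c) = 1 - P(A)
print(prob(A))  # 0.5
```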

Random Variable

A random variable is a function $X:\Omega\rightarrow S$ satisfying certain technical conditions (measurability).

It is a mapping or a function from possible outcomes in a sample space to a measurable space, often the real numbers.

For example, toss a coin. The sample space is $\{\text{Heads},\text{Tails}\}$, and $X$ can be 1 for heads and 0 for tails.
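In code, the random variable is literally a function on the sample space (a quick sketch mirroring the coin example):

```python
import random

# The random variable X: maps each outcome in the sample space to a number.
def X(outcome: str) -> int:
    return 1 if outcome == "Heads" else 0

outcome = random.choice(["Heads", "Tails"])  # draw an outcome from Omega
print(outcome, "->", X(outcome))
```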

Representation of Distributions

In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment.

For instance, if X is used to denote the outcome of a coin toss (“the experiment”), then the probability distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.5 for X = tails (assuming that the coin is fair).

Let’s look at how we describe a distribution. Consider $P(X\in E)$, where $E$ is a “nice” (measurable) set.

For continuous random variables, we have the probability density function $\rho_X$ (PDF): $$P(X\in E) = \int_E \rho_X(x)\, dx$$

For discrete random variables, we have the probability mass function $p_X$ (PMF): $$P(X\in E)=\sum_{x \in E} p_X(x)$$

For both types of RVs, we have the cumulative distribution function $F_X$ (CDF), which gives $F_X(x)=P(X\leq x)$.
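A quick numerical sketch of these three representations, assuming numpy and scipy are installed (the specific distributions are my own choices for illustration):

```python
import numpy as np
from scipy import stats

# Continuous case: standard normal. Integrating the PDF over E = [a, b]
# matches the CDF difference F(b) - F(a).
a, b = -1.0, 1.0
xs = np.linspace(a, b, 10001)
print(np.trapz(stats.norm.pdf(xs), xs))        # ~0.6827
print(stats.norm.cdf(b) - stats.norm.cdf(a))   # ~0.6827

# Discrete case: Binomial(10, 0.5). Summing the PMF over E = {0, 1, 2}
# matches the CDF value P(X <= 2).
print(sum(stats.binom.pmf(k, 10, 0.5) for k in [0, 1, 2]))
print(stats.binom.cdf(2, 10, 0.5))
```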

Joint Distribution

The joint distribution of a random vector $(X_1,\dots,X_n)$ describes the probabilities of its components taking values together. A typical task is computing the expectation of a function of a random vector, e.g. $E[g(X,Y)]$, as in the sketch below.
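A minimal Monte Carlo sketch, assuming $X$ and $Y$ are independent standard normals and $g(x,y)=x^2+xy$ (both arbitrary choices of mine, not from the original definition):

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo estimate of E[g(X, Y)] for the random vector (X, Y).
n = 1_000_000
x = rng.standard_normal(n)
y = rng.standard_normal(n)
estimate = np.mean(x**2 + x * y)
print(estimate)  # ~1.0, since E[X^2] = 1 and E[XY] = E[X]E[Y] = 0
```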

LLN, Law of Large Numbers

In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and tends to become closer to the expected value as more trials are performed.

$$\lim_{n\rightarrow\infty}\frac{1}{n}\sum^{n}_{i=1}X_i=\mu$$

where $\mu = E[X_i]$ (the convergence is almost sure in the strong form of the law).
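A quick simulation sketch with numpy (fair-die rolls are my own example): the running average settles toward $E[X]=3.5$.

```python
import numpy as np

rng = np.random.default_rng(0)

# LLN sketch: running averages of fair-die rolls drift toward E[X] = 3.5.
rolls = rng.integers(1, 7, size=100_000)
running_mean = np.cumsum(rolls) / np.arange(1, len(rolls) + 1)
print(running_mean[[9, 99, 9_999, 99_999]])  # closer and closer to 3.5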

CLT, Central Limit Theorem

In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are summed up, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed.

If $X_{1},X_{2},\dots,X_{n},\dots$ are random samples drawn from a population with overall mean $\mu$ and finite variance $\sigma^{2}$, and if $\bar{X}_{n}$ is the sample mean of the first $n$ samples, then the limiting form of the distribution $Z=\lim_{n\to\infty}\left(\frac{\bar{X}_{n}-\mu}{\sigma_{\bar{X}}}\right)$, with $\sigma_{\bar{X}}=\sigma/\sqrt{n}$, is a standard normal distribution.
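A simulation sketch, assuming numpy is available: standardized means of $\text{Uniform}(0,1)$ samples behave like a standard normal even though the underlying variable is far from normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# CLT sketch: standardize the sample means of Uniform(0, 1) draws.
mu, sigma = 0.5, np.sqrt(1 / 12)   # mean and std of Uniform(0, 1)
n, trials = 100, 50_000
samples = rng.random((trials, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))
print(z.mean(), z.std())           # ~0 and ~1, as for a standard normal
```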