This part contains my notes for:
- Hung-Yi Lee’s famous ML course
- ML basics & related math
- Other topics that I’d like to present
This is a scribe note of CS294-082 by Prof. Gerald Friedland from UC Berkeley. The Idea of the Function Counting Theorem: we start with the Function Counting Theorem (Cover’s Theorem, Thomas M. Cover, 1965). For example, suppose we have a 2-dimensional space with 4 points. There are multiple ways to linearly separate these points: look at $l_5$; it separates $x_1$, $x_4$ on the left side and $x_2$, $x_3$ on the right side....
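To make the counting concrete, here is a minimal sketch (my own, not from the scribe) of Cover’s counting formula $C(N, d) = 2\sum_{k=0}^{d}\binom{N-1}{k}$ for the number of dichotomies of $N$ points in general position in $d$ dimensions that an affine hyperplane can realize; the function name and the affine convention are assumptions on my part.

```python
from math import comb

def cover_count(n_points: int, dim: int) -> int:
    """Number of linearly separable dichotomies of n_points in general
    position in `dim` dimensions, allowing an affine separating hyperplane
    (Cover's Function Counting Theorem, 1965)."""
    # An affine hyperplane in `dim` dimensions corresponds to a homogeneous
    # one in dim + 1 dimensions, hence the sum runs up to k = dim.
    return 2 * sum(comb(n_points - 1, k) for k in range(dim + 1))

# 4 points in the plane: 14 of the 2^4 = 16 dichotomies are separable
# (the XOR labelling and its complement are the two that are not).
print(cover_count(4, 2))   # -> 14
```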
Reference: Here, which is a well-written introduction to both concepts. Entropy: “The entropy of a random variable is a function which attempts to characterize the ‘unpredictability’ of a random variable.” The unpredictability is related to both the frequency and the number of outcomes. A fair 666-sided die is more unpredictable than a 6-sided die. But if we cheat on the 666-sided one by making the side with number 1 super heavy, we may then find the 666-sided die more predictable....
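As a quick illustration (my own sketch, not from the reference), the following computes the Shannon entropy of the dice mentioned above; the exact bias chosen for the “cheated” 666-sided die is an illustrative assumption.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum p(x) log2 p(x)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_6   = [1/6] * 6
fair_666 = [1/666] * 666
# A "cheated" 666-sided die: side 1 comes up 99% of the time,
# the remaining 1% spread over the other 665 sides (illustrative numbers).
biased_666 = [0.99] + [0.01 / 665] * 665

print(entropy(fair_6))      # ~2.58 bits
print(entropy(fair_666))    # ~9.38 bits -> more unpredictable than the 6-sided die
print(entropy(biased_666))  # ~0.17 bits -> far more predictable than both
```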
This is an additional but useful note. First, recap derivatives for scalars, for example: $\frac{dy}{dx} = nx^{n-1}$ for $y = x^n$. And we all know the rules for different kinds of functions/composed functions. Note that the derivative does not always exist. When we generalize derivatives to gradients, we are generalizing from scalars to vectors. In this case, the shape matters.

| | scalar $x$ | vector $\textbf{x}$ |
|---|---|---|
| scalar $y$ | $\frac{\partial y}{\partial x}$ | $\frac{\partial y}{\partial \textbf{x}}$ |
| vector $\textbf{y}$ | $\frac{\partial \textbf{y}}{\partial x}$ | $\frac{\partial \textbf{y}}{\partial \textbf{x}}$ |

Case 1: $y$ is a scalar, $\textbf{x}$ is a vector
$$\textbf{x} = [x_1,x_2,x_3,\cdots,x_n]^T$$
$$\frac{\partial y}{\partial \textbf{x}}=[\frac{\partial y}{\partial x_1},\frac{\partial y}{\partial x_2},\cdots,\frac{\partial y}{\partial x_n}]$$...
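A small numerical check of Case 1 (my own sketch, not part of the note): for $y = \textbf{x}^T\textbf{x}$ the gradient is $2\textbf{x}$, and a finite-difference estimate of the partials matches it entry by entry.

```python
import numpy as np

# Case 1: y = ||x||^2 is a scalar function of a vector x, so dy/dx collects
# the partials [dy/dx_1, ..., dy/dx_n] (here: 2x).
def y(x):
    return float(x @ x)

def grad_y(x):
    return 2 * x                       # analytic gradient

def numeric_grad(f, x, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.array([1.0, -2.0, 3.0])
print(grad_y(x))                       # [ 2. -4.  6.]
print(numeric_grad(y, x))              # matches up to ~1e-6
```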
By Yinggan XU (Dibbla). This is generated from a previous course (not included in Lee’s 2022 series); video can be found: RNN. The RNN aims to deal with sequential inputs. We can first focus on the problem of slot filling: Time: ______ Destination: ______ Here, Time and Destination are the slots. We would like to automatically fill in the slots given a sentence: I would like to fly to Taipei on Nov 2nd....
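As a rough sketch of the recurrence the note describes (not the course’s code; all sizes and names below are assumptions), a vanilla RNN keeps a hidden state that is updated word by word and emits per-word slot scores:

```python
import numpy as np

# Minimal sketch of the vanilla RNN recurrence for slot filling: each word
# vector x_t updates a hidden state h_t that summarizes the sentence so far;
# a per-step output scores slot labels (e.g. Time, Destination, Other).
rng = np.random.default_rng(0)
d_in, d_hid, n_slots = 8, 16, 3             # word dim, hidden dim, slot labels

W_x = rng.normal(scale=0.1, size=(d_hid, d_in))
W_h = rng.normal(scale=0.1, size=(d_hid, d_hid))
W_o = rng.normal(scale=0.1, size=(n_slots, d_hid))
b_h = np.zeros(d_hid)

def rnn_forward(xs):
    h = np.zeros(d_hid)                     # initial hidden state
    outputs = []
    for x_t in xs:                          # process the sentence word by word
        h = np.tanh(W_x @ x_t + W_h @ h + b_h)
        outputs.append(W_o @ h)             # slot scores for this word
    return outputs

sentence = [rng.normal(size=d_in) for _ in range(6)]   # stand-in word vectors
print([o.shape for o in rnn_forward(sentence)])        # six (3,) score vectors
```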
By Yinggan XU (Dibbla). This is generated from a previous course (not included in Lee’s 2022 series); video can be found: CNN. The motivation is that we could of course use an MLP to find a function for image classification, etc. However, this is neither necessary nor efficient due to the tremendous number of parameters. We are going to use the properties of images themselves. Before that, we need to know the structure of a picture....
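To see the parameter-count argument concretely, here is a minimal sketch (my own, not from the course) of a single “valid” 2D convolution that reuses one small kernel across the whole image:

```python
import numpy as np

# One small kernel is slid over the image and reused at every location
# (parameter sharing + local receptive fields), which is why a convolution
# needs far fewer parameters than a fully connected layer.
def conv2d_valid(image, kernel):
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image  = np.random.rand(32, 32)     # a toy 32x32 grayscale "picture"
kernel = np.random.rand(3, 3)       # 9 shared weights, regardless of image size

print(conv2d_valid(image, kernel).shape)   # (30, 30)
# A fully connected layer from 32*32 inputs to 30*30 outputs would need
# 1024 * 900 ≈ 0.9M weights; the convolution above uses just 9.
```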