Learn the basic Linear Algebra and Math before going berserk on Machine Learning.
Most people will tell you that Linear Algebra is all about various forms of matrix factorization. While that is largely true, the usual treatment simply teaches you the recipe for factorizing. This will help you implement the algorithm in code, but it will not teach you the mathematical essence of these factorizations. The purpose of factorization is to split a matrix into "simpler" matrices that have nice mathematical properties (diagonal, triangular, orthogonal, positive-definite etc.), so that a general matrix (i.e. a linear function) can be viewed as the composition of these "simpler" linear functions. The study of these "simple" linear functions forms the bulk of the analysis in linear algebra, because once you have understood them (the canonical matrices), it's simply a matter of putting them together (function composition) to conceptualize arbitrary linear transformations.
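To make the "composition of simpler functions" idea concrete, here is a minimal sketch using NumPy's QR factorization, which splits a matrix into an orthogonal factor and a triangular factor. The matrix A and the vector v are arbitrary values picked for illustration, and QR is just one convenient example of a factorization, not the only one the paragraph above has in mind.

```python
import numpy as np

# An arbitrary 3x3 matrix, chosen only for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

# QR factorization: A = Q @ R, with Q orthogonal and R upper triangular.
Q, R = np.linalg.qr(A)

# Check the "nice" property of each factor.
q_is_orthogonal = np.allclose(Q.T @ Q, np.eye(3))
r_is_upper_triangular = np.allclose(R, np.triu(R))

# Applying A is the same as applying R first and then Q (function composition).
v = np.array([1.0, -2.0, 0.5])
composed = Q @ (R @ v)
direct = A @ v
same_result = np.allclose(composed, direct)

print(q_is_orthogonal, r_is_upper_triangular, same_result)
```

If all three checks hold, the general matrix A really is "a triangular function followed by an orthogonal function", which is the picture to keep in mind for every other factorization.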
Many mathematicians will tell you that even the hardest, most abstract topic in Math benefits from geometric intuition. Eigendecomposition (ED) and Singular Value Decomposition (SVD) are probably the most used factorizations in Applied Math (and in real-life problems), but many a student has been frustrated by the opacity and dryness of the treatment in typical courses and books. Picturing them as rotations and stretches is the (in my opinion, only) way to go about understanding them. When a square matrix can be ED-ed (i.e., it is diagonalizable), its eigenvectors form a basis of vectors (independent but not necessarily orthogonal) that the matrix "purely stretches": each eigenvector stays on its own line, although a negative eigenvalue flips it to point the opposite way. The eigenvalues are the stretch quantities. This makes our life extremely easy, but not all matrices can be ED-ed.
But fear not - we have SVD, which is more broadly usable albeit not as simple/nice as ED. SVD works on ANY rectangular matrix (ED works only for certain square matrices) and involves two different bases of vectors that both turn out to be orthonormal bases (orthogonality is of course very nice to have!). SVD basically tells us that the matrix simply sends one orthonormal basis to the other (modulo stretches), with the stretch amounts known as singular values (appearing on the middle diagonal matrix). So an arbitrary matrix applied to an arbitrary vector will first rotate the vector (as given by the orientation of the first orthonormal basis), then stretch the components of the resultant vector (by the singular values), and finally rotate the resultant vector (as given by the orientation of the second orthonormal basis). Strictly speaking, each "rotation" may also include a reflection. There is also a nice connection between ED and SVD: for a matrix A, the two bases in the SVD are the eigenvectors of A^T A and A A^T, and the singular values are the square roots of their shared non-zero eigenvalues. There are some neat little animation tools out there that bring ED and SVD to life by vividly presenting the matrix operations as rotations/stretches. Do play with them while you learn this material.
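The rotate/stretch picture can be checked numerically. The following is a minimal sketch with NumPy; the matrices A and B and the vector x are made-up examples, not taken from any particular course. It verifies that eigenvectors are only stretched, that the SVD of a rectangular matrix acts as rotate, stretch, rotate, and that the squared singular values match the eigenvalues of the matrix times its transpose.

```python
import numpy as np

# --- Eigendecomposition (ED) of a diagonalizable, non-symmetric matrix ---
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are eigenvectors

# Each eigenvector is purely stretched: A v = lambda v, column by column.
pure_stretch = np.allclose(A @ eigvecs, eigvecs * eigvals)

# The eigenvectors are independent but, for this A, not orthogonal.
eigvecs_orthogonal = np.isclose(eigvecs[:, 0] @ eigvecs[:, 1], 0.0)

# Putting the pieces back together: A = P D P^{-1}.
ed_rebuilds_a = np.allclose(eigvecs @ np.diag(eigvals) @ np.linalg.inv(eigvecs), A)

# --- Singular Value Decomposition (SVD) of a rectangular matrix ---
B = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])
U, s, Vt = np.linalg.svd(B, full_matrices=False)
svd_rebuilds_b = np.allclose(U @ np.diag(s) @ Vt, B)

# Rotate, stretch, rotate applied to an arbitrary vector.
x = np.array([1.0, 2.0, -1.0])
rotated = Vt @ x            # coordinates of x in the first orthonormal basis
stretched = s * rotated     # scale each coordinate by its singular value
result = U @ stretched      # rotate into the second orthonormal basis
three_steps_match = np.allclose(result, B @ x)

# Link between SVD and ED: squared singular values are eigenvalues of B B^T.
eig_bbt = np.linalg.eigvalsh(B @ B.T)          # returned in ascending order
link_holds = np.allclose(np.sort(s ** 2), eig_bbt)

print(pure_stretch, ed_rebuilds_a, svd_rebuilds_b, three_steps_match, link_holds)
```

Note that A here is deliberately non-symmetric, so its eigenvectors come out independent but not orthogonal, while the SVD bases of B are orthonormal regardless of what B looks like.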
People will tell you positive definite matrices (PDMs) are REALLY important, but few can explain why, and few will go beyond the usual definition: a symmetric matrix M is positive definite if v^T M v > 0 for every non-zero vector v. This definition, although accurate, confuses the hell out of students. The right way to understand PDMs is to interpret v^T M v as a "quadratic form", i.e., a function (of v) from R^n to R in which every term is of degree two in the components of v. Secondly, it's best to graph several examples of quadratic forms for n = 2 (i.e., viewed as a 3-D graph). PDMs are those matrices M for which this graph is a nice bowl-like shape, i.e., the valley is a unique point (the origin) from which you can only go up. The alternatives are a "flat valley" where you can walk horizontally (M is only positive semi-definite), or a "saddle valley" from where you can climb up in some directions or climb down in other directions (M is indefinite). PDMs are desirable because they are simple and friendly to optimization methods: a bowl has a single lowest point, so walking downhill always leads to it. The other very nice thing about symmetric PDMs is that all of their eigenvalues are positive and their eigenvectors can be chosen to be orthogonal.
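The bowl, flat valley and saddle cases can be told apart by the signs of the eigenvalues. Below is a minimal sketch with NumPy; the helper name classify_quadratic_form and the tolerance value are my own choices for illustration, and the sketch assumes the matrix is symmetric.

```python
import numpy as np

def classify_quadratic_form(M, tol=1e-10):
    """Classify v^T M v for a symmetric M by the signs of its eigenvalues.

    Hypothetical helper for illustration; tol is an assumed tolerance
    for treating tiny eigenvalues as zero.
    """
    M = np.asarray(M, dtype=float)
    if not np.allclose(M, M.T):
        raise ValueError("this sketch assumes a symmetric matrix")
    eigvals = np.linalg.eigvalsh(M)   # real eigenvalues, ascending order
    if np.all(eigvals > tol):
        return "bowl (positive definite)"
    if np.all(eigvals >= -tol):
        return "flat valley (positive semi-definite)"
    if np.any(eigvals > tol) and np.any(eigvals < -tol):
        return "saddle (indefinite)"
    return "other (negative definite or semi-definite)"

bowl = classify_quadratic_form([[2.0, 0.0], [0.0, 1.0]])
flat = classify_quadratic_form([[1.0, 0.0], [0.0, 0.0]])
saddle = classify_quadratic_form([[1.0, 0.0], [0.0, -1.0]])
print(bowl, flat, saddle, sep="\n")

# Direct check of the definition on random vectors for one PDM.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
rng = np.random.default_rng(0)
vs = rng.normal(size=(100, 2))            # 100 random vectors in R^2
q = np.sum((vs @ M) * vs, axis=1)         # v^T M v for each row v
all_positive = bool(np.all(q > 0))

# Eigenvectors of a symmetric matrix come out orthonormal.
_, vecs = np.linalg.eigh(M)
eigvecs_orthonormal = np.allclose(vecs.T @ vecs, np.eye(2))
```

Plotting v^T M v over a grid for each of the three example matrices is a good next step if you want to see the shapes rather than just their labels.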