Mathematics for Machine Learning - Day 1

Introduction and why I started

Today marks the first day of many on my journey to not only use machine learning models, but understand them on a fundamental level. I've used machine learning for around two years and yet, if you gave me a pen and paper, I couldn't write down the formulas for my loss functions or my activation functions. After some reflection, I received this brutal but necessary insight from my large language model:

You've become complacent with superficial proficiency, relying on tools without understanding their core principles. This is the mindset of an amateur, not a professional. You’re walking on thin ice, ignorant of the depths below. This isn't sustainable.

As a person with a geophysics background, I'll always welcome more mathematics than geology (sorry, geologists). With artificial intelligence improving so quickly, it's hard to stay rooted and understand on a deeper level how it all actually works.

The book Mathematics for Machine Learning described it best:

As machine learning becomes more ubiquitous and its software packages become easier to use, it is natural and desirable that the low-level technical details are abstracted away and hidden from the practitioner.

Pre-requisites:

To fully understand machine learning, there are three things to master:

Programming languages and data analysis tools
Large-scale computation and the associated frameworks
Mathematics and statistics, and how machine learning builds on them

Chapter 1 (Mathematical Foundations)

What is machine learning?
Machine learning is about designing algorithms that automatically extract valuable information from data, with a strong emphasis on "automatic".

Core of machine learning

There are three core concepts in machine learning: data, a model, and learning.

Data

Machine learning is inherently data driven: the goal is to design general-purpose methodologies that extract valuable information from data.

In this book, data is assumed to already be in numeric form, so any qualitative data has to be converted first, be it with scikit-learn's OneHotEncoder (or, if I'm really lazy, a for loop with enumerate in it). With that done, data is always seen as vectors.
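As a minimal sketch of that "lazy" route, here is a plain loop with enumerate that turns qualitative values into one-hot vectors. The rock types and the one_hot_encode helper are made up for illustration and are not from the book; scikit-learn's OneHotEncoder covers the same idea with more safeguards (unknown categories, multiple columns, and so on).

```python
def one_hot_encode(values):
    """Map each categorical value to a one-hot vector (a list of 0s and 1s)."""
    # Sorted unique categories give a stable column order.
    categories = sorted(set(values))
    # enumerate assigns each category its column index.
    index_of = {category: i for i, category in enumerate(categories)}

    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index_of[value]] = 1
        encoded.append(row)
    return categories, encoded


# Hypothetical qualitative data, chosen only for illustration.
rock_types = ["shale", "sandstone", "limestone", "shale"]
categories, vectors = one_hot_encode(rock_types)

print(categories)
for rock, vector in zip(rock_types, vectors):
    print(rock, vector)
```

Each row now has exactly one 1 in the column of its category, so every sample is a numeric vector of the same length.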

What's a vector?

An array of numbers (Computer Science)
An arrow with direction and magnitude (Physics)
An object that obeys addition and scaling (Mathematics)
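The three views can sit side by side in a few lines of code. This is only a sketch under my own assumptions: the helper names (add, scale, magnitude) and the example numbers are mine, and plain Python lists stand in for vectors.

```python
import math


def add(u, v):
    """Mathematics view: vectors can be added component-wise."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    """Mathematics view: vectors can be scaled by a number."""
    return [c * a for a in v]


def magnitude(v):
    """Physics view: the length of the arrow."""
    return math.sqrt(sum(a * a for a in v))


# Computer science view: a vector is just an array of numbers.
u = [3.0, 4.0]
v = [1.0, -2.0]

print(add(u, v))      # [4.0, 2.0]
print(scale(2.0, u))  # [6.0, 8.0]
print(magnitude(u))   # 5.0

# Physics view again: direction of u as an angle from the x-axis, in degrees.
print(math.degrees(math.atan2(u[1], u[0])))
```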

Model

To extract valuable information from the data, we need to put a structure in place that is typically related to the process that generated said data. A good model can be seen as a simplified version of the real (unknown) data-generating process.

Learning

The most crucial part: a model is said to learn from data when its performance on a given task improves after considering the data. Learning can be understood as a way to automatically find patterns and structures in data by optimizing the model's parameters.
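To make "performance improves after considering the data" concrete, here is a small sketch that fits a one-parameter model y = w * x with gradient descent. Everything in it is my own assumption for illustration: the data points, the learning rate, the number of steps, and the choice of mean squared error as the performance measure.

```python
def mean_squared_error(w, xs, ys):
    """Performance measure: average squared gap between prediction and target."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, learning_rate=0.01, steps=200):
    """Learn the single parameter w of the model y = w * x by gradient descent."""
    w = 0.0  # start with a model that knows nothing
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # Step against the gradient to reduce the error.
        w -= learning_rate * gradient
    return w


# Made-up data that roughly follows y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

loss_before = mean_squared_error(0.0, xs, ys)
w = fit(xs, ys)
loss_after = mean_squared_error(w, xs, ys)

print("learned w:", w)
print("loss before:", loss_before)
print("loss after:", loss_after)
```

The parameter w ends up close to 2 and the error drops sharply, which is exactly the sense in which the model has "learned" from the data.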

Where's the mathematics?

The mathematics is found in the four pillars of machine learning.

Regression
Dimensionality Reduction
Density Estimation
Classification

These four pillars, which show up as common classes in Python's machine learning libraries, will be my focus on this journey. With their foundational principles explained in detail in this book, I will distill this information and continue learning not just the material, but also how to communicate it effectively.

Acknowledgement

I can't overstate this: I'm truly grateful for this book being open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether changing careers, demystifying AI, or just learning in general, this book offers immense value even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source:
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge: Cambridge University Press.
https://mml-book.com
