Read Along: Probabilistic Machine Learning, An Introduction by Kevin P. Murphy (1.2.1.1 - 1.2.1.2)
alice
Posted on December 30, 2023
In this blog series, I summarize and discuss "Probabilistic Machine Learning: An Introduction" by Kevin P. Murphy, adding my own examples to aid understanding and retention.
1.2.1.1 Classification
Image classification presents challenges due to its high dimensionality, where each pixel represents a feature. This complexity increases in color images, as each pixel comprises multiple features based on the RGB (Red, Green, Blue) channels, while in grayscale images, each pixel represents a single intensity feature.
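The input dimensionality of an image is given by D = C x D1 x D2, where C is the number of channels (C = 1 for grayscale, C = 3 for RGB) and D1 x D2 is the number of pixels (height by width).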
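To make the formula concrete, here is a small sketch that counts features for a grayscale and a color image and flattens each into a single feature vector. The image sizes (28 x 28 and 32 x 32) and the helper name num_features are my own choices for illustration, not something taken from the book.

```python
import numpy as np

def num_features(channels, height, width):
    """Return D = C x D1 x D2 for an image with the given shape."""
    return channels * height * width

# Hypothetical image sizes, chosen only for illustration.
gray = np.zeros((1, 28, 28))    # C = 1, D1 = 28, D2 = 28
color = np.zeros((3, 32, 32))   # C = 3, D1 = 32, D2 = 32

# Flatten each image into one long feature vector of length D.
gray_vector = gray.reshape(-1)
color_vector = color.reshape(-1)

print(num_features(*gray.shape), gray_vector.shape)    # 784 (784,)
print(num_features(*color.shape), color_vector.shape)  # 3072 (3072,)
```

Even these tiny images have hundreds or thousands of features, which is why image inputs count as high-dimensional.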
Convolutional neural networks (CNNs) are pivotal here. They learn hierarchical image patterns, which makes them essential for tasks like image classification and object recognition.
The Iris dataset is an example of tabular data stored as a design matrix of size N x D, where N is the number of examples and D is the number of features. For Iris, N = 150 and D = 4.
Big data is characterized by N >> D (far more examples than features), while wide data, where D >> N, often leads to overfitting. Overfitting occurs when a model fits the training data too closely, including its noise, which hinders its ability to generalize to new data. Wide data arises when each example is described by many measurements but only a few examples are available.
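Here is a rough sketch of why wide data invites overfitting. I fit ordinary least squares to targets that are pure noise, once in a wide setting (N = 10, D = 50) and once in a big setting (N = 1000, D = 5). The sizes, the random data, and the function name fit_and_score are my own assumptions for the demo; the book does not present this experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_and_score(n_train, n_features, n_test=200):
    """Fit least squares to pure-noise targets; return (train_mse, test_mse)."""
    X_train = rng.normal(size=(n_train, n_features))
    y_train = rng.normal(size=n_train)   # targets are unrelated to X
    X_test = rng.normal(size=(n_test, n_features))
    y_test = rng.normal(size=n_test)

    # When D > N, lstsq returns the minimum-norm solution that
    # interpolates the training targets.
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    train_mse = np.mean((X_train @ w - y_train) ** 2)
    test_mse = np.mean((X_test @ w - y_test) ** 2)
    return train_mse, test_mse

wide_train, wide_test = fit_and_score(n_train=10, n_features=50)
big_train, big_test = fit_and_score(n_train=1000, n_features=5)

print("wide data: train MSE", wide_train, "test MSE", wide_test)
print("big data:  train MSE", big_train, "test MSE", big_test)
```

In the wide case the model reproduces the training noise almost exactly, yet does no better than guessing on new data. In the big case the training and test errors stay close to each other, because five weights cannot memorize a thousand noisy targets.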
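Featurization is the process of converting inputs that do not come as fixed-length vectors, such as word sequences or graphs, into a fixed-size feature representation. This implicitly creates a design matrix that standard machine learning methods can work with.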
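As one possible illustration of featurization, the sketch below turns sentences of different lengths into fixed-length word-count vectors (a bag-of-words representation). The example sentences and the helper names build_vocabulary and featurize are mine; the book mentions featurization in general and does not prescribe this particular scheme.

```python
def build_vocabulary(documents):
    """Map each distinct lowercase word to a column index."""
    words = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}

def featurize(document, vocabulary):
    """Turn one document into a fixed-length vector of word counts."""
    vector = [0] * len(vocabulary)
    for word in document.lower().split():
        if word in vocabulary:
            vector[vocabulary[word]] += 1
    return vector

# Hypothetical toy documents of different lengths.
documents = ["the cat sat", "the dog sat on the mat", "dog"]
vocabulary = build_vocabulary(documents)

# Every row has the same length D = len(vocabulary), so the rows
# stack into an N x D design matrix.
design_matrix = [featurize(doc, vocabulary) for doc in documents]

for row in design_matrix:
    print(row)
```

The inputs had three, six, and one word, but each row of the result has the same six columns, one per vocabulary word.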
1.2.1.2 Exploratory Data Analysis (EDA)
EDA is a crucial preliminary step involving the screening of raw data for evident patterns and issues before applying complex models.
For low-dimensional data, pair plots are common. A pair plot is a grid of panels: each diagonal panel shows the distribution of a single variable, and each off-diagonal panel shows a scatter plot of one pair of variables, which makes pairwise relationships easy to spot.
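A pair plot is a picture, but the same pairwise view can be approximated numerically with a correlation matrix. The sketch below is that numeric stand-in, not a pair plot itself, and it runs on synthetic data I made up for the purpose: y depends on x, while z is independent of both.

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples = 500

# Synthetic variables, assumed only for this demo.
x = rng.normal(size=n_examples)
y = 2.0 * x + 0.5 * rng.normal(size=n_examples)  # strongly related to x
z = rng.normal(size=n_examples)                  # unrelated to x and y

data = np.column_stack([x, y, z])                # design matrix, N x 3

# Entry (i, j) is the Pearson correlation between columns i and j,
# the numeric counterpart of the scatter plot in panel (i, j).
correlations = np.corrcoef(data, rowvar=False)

print(np.round(correlations, 2))
```

A large entry for the (x, y) pair and near-zero entries for the pairs involving z match what the scatter panels of a pair plot would show. Keep in mind that correlation only captures linear relationships, so the plot can reveal structure that this matrix misses.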
For high-dimensional data, it is common to perform dimensionality reduction first and then visualize the data in 2D or 3D.
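As a minimal sketch of that preliminary step, here is principal component analysis (PCA) implemented with NumPy's SVD, used to project 10-dimensional data down to 2D. PCA is my choice of technique for the example, and the synthetic data and the function name pca_project are assumptions of mine.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, explained_ratio[:n_components]

rng = np.random.default_rng(0)

# Synthetic data: 200 examples in 10 dimensions that really vary
# along only 2 hidden directions, plus a little noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

Z, ratio = pca_project(X, n_components=2)

print(Z.shape)      # (200, 2)
print(ratio.sum())  # fraction of variance kept by the 2D projection
```

The 2D array Z can then go straight into a scatter plot, giving a first visual impression of data that could not be plotted directly.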