Amazon Machine Learning | ML Key Concepts
Kareem Negm
Posted on November 24, 2021
Amazon Machine Learning Key Concepts
Data sources
Term | Definition |
---|---|
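Attribute | A unique, named property within an observation. In tabular-formatted data such as spreadsheets or CSV files, the column headings represent the attributes and the rows contain the values for each attribute. |
Datasource Name | An optional, human-readable name for a datasource, used to find and manage it in the Amazon ML console. |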
Input Data | Collective name for all the observations that are referred to by a datasource. |
Location | Where the input data is stored. Amazon ML can use data stored in Amazon S3 buckets, Amazon Redshift databases, or MySQL databases in Amazon Relational Database Service (RDS). |
Observation | A single input data unit that is part of a datasource. For example, in a fraud-detection dataset, each observation represents one transaction. |
Schema | The information needed to interpret the input data, including attribute names and their assigned data types, and names of special attributes. |
Statistics | Summary statistics for each attribute in the input data, which help you understand the data at a glance and spot irregularities or errors. |
Status | Indicates the current state of the datasource, such as In Progress, Completed, or Failed. |
Target Attribute | The attribute whose value a trained ML model will predict. In training data, it holds the correct answers that Amazon ML learns from. |
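
To make the Schema and Target Attribute entries concrete, here is a minimal sketch that assembles a schema as a Python dictionary and prints it as JSON. The column names are hypothetical, and the key names (`targetAttributeName`, `attributeType`, and so on) follow the Amazon ML schema layout as best I recall it, so treat them as assumptions and check them against the official schema documentation before use.

```python
import json

# Hypothetical attributes for a fraud-detection CSV file (names are made up).
columns = [
    ("transaction_id", "CATEGORICAL"),
    ("amount", "NUMERIC"),
    ("merchant_note", "TEXT"),
    ("is_fraud", "BINARY"),
]


def build_schema(columns, target, has_header=True):
    """Return a schema-like dict: attribute names, data types, and the target."""
    names = [name for name, _ in columns]
    if target not in names:
        raise ValueError(f"target {target!r} is not one of the attributes")
    return {
        "version": "1.0",
        "targetAttributeName": target,  # the attribute the model will predict
        "dataFormat": "CSV",
        "dataFileContainsHeader": has_header,
        "attributes": [
            {"attributeName": name, "attributeType": dtype}
            for name, dtype in columns
        ],
    }


schema = build_schema(columns, target="is_fraud")
print(json.dumps(schema, indent=2))
```

Each entry in `attributes` corresponds to one column of the input data, and each row of the file is one observation.
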
ML Models
Term | Definition |
---|---|
Regression | An ML model that predicts a numeric value. |
Multiclass | An ML model that predicts values that belong to a limited, pre-defined set of permissible values. |
Binary | An ML model that predicts values that can only have one of two states, such as true or false. |
Model Size | ML models capture and store patterns. The more patterns an ML model stores, the bigger it will be. ML model size is described in megabytes (MB). |
Number of Passes | The number of times that you let Amazon ML use the same data records during training. It is sometimes beneficial to use each data record in the learning process more than once. |
Regularization | A machine learning technique that you can use to obtain higher-quality models. It helps prevent overfitting by penalizing extreme weight values. |
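
The effect of the number of passes and of regularization can be illustrated with a toy stochastic gradient descent loop. This is a sketch for intuition only, not Amazon ML's actual training code; the function name, the learning rate, and the data are all assumptions made for the example.

```python
def train_weight(records, n_passes=10, learning_rate=0.01, l2=0.0):
    """Fit y ~ w * x with SGD, visiting every record once per pass."""
    w = 0.0
    for _ in range(n_passes):
        for x, y in records:
            error = w * x - y
            # The l2 term penalizes large weights and pulls w toward zero.
            w -= learning_rate * (error * x + l2 * w)
    return w


records = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated from y = 2x

w_one = train_weight(records, n_passes=1)            # one pass: far from 2
w_many = train_weight(records, n_passes=200)         # many passes: close to 2
w_reg = train_weight(records, n_passes=200, l2=1.0)  # regularized: shrunk below 2

print(w_one, w_many, w_reg)
```

On this clean data a single pass leaves the weight well short of the true value, many passes recover it, and a strong L2 penalty deliberately shrinks it. On noisy real-world data that shrinkage is what trades a little training accuracy for better generalization.
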
Evaluations
Term | Definition |
---|---|
Model Insights | Amazon ML provides you with a metric to evaluate the predictive performance of your model. |
Precision | The fraction of positive class predictions that actually belong to the positive class. |
Recall | The fraction of all positive examples in the dataset that the model correctly predicts as positive. |
AUC | Area Under the ROC Curve (AUC) measures the ability of a binary ML model to predict a higher score for positive examples as compared to negative examples. |
Accuracy | Accuracy measures the percentage of correct predictions. |
F1-score | The macro-averaged F1-score is used to evaluate the predictive performance of multiclass ML models. |
RMSE | The Root Mean Square Error (RMSE) is a metric used to evaluate the predictive performance of regression ML models. |
Cut-off | ML models generate numeric prediction scores. The cut-off is the threshold applied to those scores to convert them into 0 and 1 labels. |
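
The metrics above are easy to compute by hand, which helps when checking what a reported value means. The following is a minimal sketch using the standard textbook definitions; the function names and sample data are hypothetical, and this is not how Amazon ML computes the values internally.

```python
import math


def binary_metrics(scores, labels, cutoff=0.5):
    """Turn scores into 0/1 labels at `cutoff`, then compute standard metrics."""
    predicted = [1 if s >= cutoff else 0 for s in scores]
    pairs = list(zip(predicted, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": (tp + tn) / len(labels),
        "f1": f1,
    }


def auc(scores, labels):
    """Share of (positive, negative) pairs where the positive scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(predictions, targets):
    """Root mean square error for regression predictions."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


scores = [0.9, 0.8, 0.4, 0.3, 0.2]  # model scores for five observations
labels = [1, 0, 1, 0, 0]            # true classes

print(binary_metrics(scores, labels, cutoff=0.5))
print(binary_metrics(scores, labels, cutoff=0.35))
print(auc(scores, labels))
print(rmse([2.0, 4.0], [1.0, 6.0]))
```

Lowering the cut-off from 0.5 to 0.35 labels one more observation as positive, which raises recall here; in general that comes with more false positives. AUC does not depend on the cut-off at all, because it only compares the ordering of the scores.
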
Batch Predictions
Term | Definition |
---|---|
Output Location | The results of a batch prediction are stored in an S3 bucket output location. |
Manifest File | This file relates each input data file with its associated batch prediction results. It is stored in the S3 bucket output location. |
Real-time Predictions
Real-time predictions are for applications with a low latency requirement, such as interactive web, mobile, or desktop applications.
Term | Definition |
---|---|
Real-time Prediction API | The Real-time Prediction API accepts a single input observation in the request payload and returns the prediction in the response. |
Real-time Prediction Endpoint | To use an ML model with the real-time prediction API, you need to create a real-time prediction endpoint. Once created, the endpoint contains the URL that you can use to request real-time predictions. |
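
A real-time request can be sketched as follows. The wrapper functions, the model ID, and the stub client are hypothetical; the `predict` and `create_realtime_endpoint` calls with these parameter names are what I understand the boto3 `machinelearning` client to expose, so verify them against the boto3 reference. The stub stands in for the real client so the sketch runs without AWS credentials.

```python
def to_record(observation):
    """Flatten one observation into attribute-name -> string-value pairs."""
    return {name: str(value) for name, value in observation.items()}


def predict_one(client, model_id, endpoint_url, observation):
    """Send a single observation to the endpoint and return the prediction."""
    response = client.predict(
        MLModelId=model_id,
        Record=to_record(observation),
        PredictEndpoint=endpoint_url,
    )
    return response["Prediction"]


class StubClient:
    """Stand-in for boto3.client("machinelearning"); returns a canned answer."""

    def predict(self, **kwargs):
        self.last_request = kwargs
        return {"Prediction": {"predictedLabel": "1"}}


stub = StubClient()
result = predict_one(
    stub,
    model_id="ml-EXAMPLE",                 # hypothetical model ID
    endpoint_url="https://example.invalid",  # placeholder endpoint URL
    observation={"amount": 12.5, "merchant_note": "online"},
)
print(result)

# With real AWS access, the stub would be replaced along these lines:
#   import boto3
#   client = boto3.client("machinelearning")
#   info = client.create_realtime_endpoint(MLModelId="ml-EXAMPLE")
#   url = info["RealtimeEndpointInfo"]["EndpointUrl"]
#   predict_one(client, "ml-EXAMPLE", url, {"amount": 12.5})
```

Passing the client in as an argument keeps the request logic testable without network access, and converting every value to a string matches the flat record shape the request payload expects.
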
AWS WhitePaper Summary