In this accelerated Machine Learning course, you will learn Tabular Data ML, Natural Language Processing (NLP), and Computer Vision.
Introduction lesson of Accelerated Tabular Data Course
This video introduces key Machine Learning concepts, goes over some Machine Learning applications, discusses the Machine Learning life-cycle, and reviews useful Machine Learning terms. Supervised and unsupervised learning are discussed, with some classification, regression, and clustering examples. A simple K Nearest Neighbors (KNN) model is built to handle a tabular data problem.
This video discusses Machine Learning model evaluation metrics and dataset splitting, along with underfitting and overfitting. For model evaluation, regression- and classification-specific metrics are discussed. The dataset splitting process is explained, allowing us to get training, validation, and test sets. We also explore using these datasets to find out whether the Machine Learning models might be underfitting or overfitting.
This lesson covers some basic Exploratory Data Analysis (EDA) tools and concepts. We learn how to use histograms, bar plots, scatter plots, correlation matrices, and understand the correlation concept. We also discuss techniques to preprocess imbalanced datasets and handle missing values.
This video introduces the K Nearest Neighbors (KNN) algorithm. We first understand how the KNN algorithm works. We then discuss how to choose the K parameter and use different distance metrics to improve performance. We also explain how the curse of dimensionality affects the algorithm, and why feature scaling improves outcomes.
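The KNN idea described here can be sketched in a few lines of plain NumPy. This is a minimal illustration on a toy dataset, not the course's actual implementation; all names are illustrative:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest points
    nearest = np.argsort(dists)[:k]
    # Majority vote over their labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D dataset: two clusters with labels 0 and 1
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.15, 0.1])))  # near cluster 0 -> 0
```

Because the prediction depends directly on distances, features on large scales dominate the vote, which is why the lesson emphasizes feature scaling.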
This video briefly introduces the topics to be discussed in the next few lessons.
This lesson previews the topics covered in Lecture 2, building on the previous lecture and moving on to more advanced Machine Learning topics.
This video discusses how to produce useful features from our raw data with Feature Engineering, with more details on handling two common types of data: Categorical and Text data. For categorical data, we explore ordinal encoding, one-hot encoding and target encoding. For text data, we use a text processing pipeline that involves text pre-processing/cleaning and vectorization.
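As a minimal sketch of one categorical technique mentioned here, one-hot encoding maps each category to a binary indicator vector (pure Python, toy data; the helper name is illustrative):

```python
def one_hot_encode(values):
    """Map each categorical value to a binary indicator vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for v in values:
        vec = [0] * len(categories)
        vec[index[v]] = 1  # flip on the position of this value's category
        vectors.append(vec)
    return categories, vectors

cats, vecs = one_hot_encode(["red", "green", "blue", "green"])
print(cats)     # ['blue', 'green', 'red']
print(vecs[0])  # 'red' -> [0, 0, 1]
```

Ordinal and target encoding replace the indicator vector with a single number (a rank, or a statistic of the target), trading dimensionality for an imposed ordering.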
This video first discusses the working principles of decision trees. We fit a simple decision tree on a sample dataset. We then introduce the impurity concept with Gini impurity score and explain the split condition selection process. We also learn how to build ensemble models using Bagging techniques.
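The Gini impurity score mentioned here is simple enough to compute by hand; a possible sketch (pure Python, illustrative function name):

```python
def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    impurity = 1.0
    for c in set(labels):
        p = labels.count(c) / n
        impurity -= p ** 2
    return impurity

print(gini_impurity([0, 0, 0, 0]))  # pure node -> 0.0
print(gini_impurity([0, 0, 1, 1]))  # 50/50 split -> 0.5
```

The tree picks the split whose children have the lowest weighted impurity, so a pure node (impurity 0) is the ideal outcome of a split.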
This video discusses how to find a good combination of hyperparameters using different approaches. We cover grid search, randomized search and Bayesian search.
This lesson briefly introduces Amazon SageMaker, a fully managed service that removes the heavy lifting from each step of the Machine Learning process: building, training, tuning, and deploying Machine Learning models.
This video previews the topics covered in Lecture 3, building on the previous lecture and moving on to more advanced Machine Learning topics.
This lesson starts by reviewing the basics of optimization, which is key to training many Machine Learning models. We introduce the gradient concept and the gradient descent optimization technique. We then discuss Linear and Logistic Regression models, with a focus on learning these models by gradient descent. We also discuss another key concept in Machine Learning: Regularization. With regularization we are able to build Machine Learning models that can generalize well on unseen datasets. We cover L1, L2, and ElasticNet regularization.
This video discusses Boosting, an ensemble method to create strong models by building multiple weak models sequentially. We briefly cover gradient boosting machines, XGBoost, LightGBM, and CatBoost.
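The sequential idea behind gradient boosting can be sketched with regression stumps fit to residuals under squared loss. This toy NumPy version is only the core loop; real libraries such as XGBoost add regularization, tree depth, subsampling, and much more (all names here are illustrative):

```python
import numpy as np

def fit_stump(x, residuals):
    """Best single-split regression stump on 1-D inputs (squared error)."""
    best = None
    for t in x:
        left, right = residuals[x <= t], residuals[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        lm, rm = left.mean(), right.mean()
        sse = ((left - lm) ** 2).sum() + ((right - rm) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda z: np.where(z <= t, lm, rm)

def gradient_boost(x, y, n_rounds=50, lr=0.1):
    """Sequentially fit stumps to residuals of the running prediction."""
    pred = np.zeros_like(y)
    stumps = []
    for _ in range(n_rounds):
        # residuals are the negative gradient of squared loss
        stump = fit_stump(x, y - pred)
        pred = pred + lr * stump(x)
        stumps.append(stump)
    return lambda z: lr * sum(s(z) for s in stumps)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
model = gradient_boost(x, y)
print(model(x))  # close to y after 50 rounds
```

Unlike Bagging, which trains independent models on bootstrap samples, each weak model here depends on the errors of everything trained before it.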
This video covers one of the most important types of modern Machine Learning models: Neural Networks. We introduce neural networks using their similarity to regression models. We learn about the layered structure of neural networks, how data is passed with forward propagation, and how training works with back-propagation, leveraging open-source deep learning frameworks.
In this lesson, we go over the course overview, learning outcomes and machine learning resources that we will use in this class.
In this lesson, we introduce Machine Learning (ML). We learn the machine learning lifecycle, which gives us a high-level view of some important ML processes, and cover some useful ML keywords.
We give some ML application examples in this lesson, going over ranking, recommendation, classification, regression, clustering, and anomaly detection applications.
In this lesson, we learn about supervised and unsupervised learning. We then see more details on supervised learning with regression and classification example problems, and also have an example of unsupervised learning.
We look at the class imbalance problem in this lesson and go over some methods to fix it.
In this video, we go over the missing values topic, which is a common issue in machine learning problems. We learn to solve this problem by dropping records or by using some imputation methods.
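A minimal sketch of one common imputation method, mean imputation, using NumPy (toy data; the function name is illustrative):

```python
import numpy as np

def impute_mean(column):
    """Replace NaNs with the mean of the observed values."""
    col = np.asarray(column, dtype=float)
    mean = np.nanmean(col)        # mean that ignores missing entries
    col[np.isnan(col)] = mean
    return col

ages = [25.0, np.nan, 35.0, np.nan, 30.0]
print(impute_mean(ages))  # NaNs replaced with 30.0, the mean of 25, 35, 30
```

Dropping records is simpler but throws away the observed values in those rows; imputation keeps them at the cost of injecting an assumption about the missing ones.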
In this lesson, we learn model evaluation metrics, dataset splitting, and overfitting and underfitting cases. For model evaluation, we consider regression- and classification-specific metrics. We then explain the dataset splitting process that allows us to get training, validation, and test sets. We also talk about how to find out whether our ML models might be underfitting or overfitting.
In this lesson, we introduce NLP and learn these NLP terms: corpus, token, and feature vector.
In this lesson, we learn a simple pipeline to process text information. Machine learning models need well-defined numerical data. Our pipeline takes text data and applies pre-processing operations to it. Then, it converts the text into a numerical representation with the vectorization step. After that, we can use a simple machine learning model on the data.
We give more details about text pre-processing in this video. We learn tokenization, stop words removal, and stemming and lemmatization methods. These methods clean up and normalize our text data.
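The pre-processing steps above can be sketched in pure Python. The stop word list and the suffix-stripping "stemmer" below are deliberately toy versions; a real pipeline would use something like NLTK's stop word lists and PorterStemmer:

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "and", "of"}  # tiny illustrative list

def preprocess(text):
    """Lowercase, tokenize, drop stop words, apply a crude stem."""
    tokens = re.findall(r"[a-z]+", text.lower())       # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    # toy stemmer: strip a trailing 's' from longer words
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]

print(preprocess("The cats and the dogs"))  # ['cat', 'dog']
```

Normalizing surface variations like case and plural forms shrinks the vocabulary, so the vectorization step that follows produces denser, more useful features.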
In this lesson, we learn how to convert text data into numerical representation using Bag of Words method. This is an easy method that uses word counts/frequencies and gets numerical features from text data. We go over word counts, Term Frequency (TF) and Term Frequency - Inverse Document Frequency (TF-IDF).
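The TF-IDF weighting can be sketched directly from its definition (pure Python, toy corpus; this uses plain tf = count/length and idf = log(N/df), one of several common variants):

```python
import math

def tf_idf(docs):
    """Term frequency times inverse document frequency, per document."""
    n_docs = len(docs)
    # document frequency: in how many documents does each term appear?
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weighted = []
    for doc in docs:
        weights = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)
            idf = math.log(n_docs / df[term])
            weights[term] = tf * idf
        weighted.append(weights)
    return weighted

docs = [["cat", "sat", "mat"], ["cat", "ate", "fish"]]
w = tf_idf(docs)
print(w[0]["cat"])      # 0.0: 'cat' is in every document, so it carries no signal
print(w[0]["mat"] > 0)  # True: 'mat' appears only in the first document
```

This is exactly the motivation for IDF: terms that appear everywhere are down-weighted, while rarer, more discriminative terms are boosted.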
We introduce the K Nearest Neighbors model in this video. We first understand how this model works. We then learn how to choose the K parameter and use different distance metrics. We also learn how the curse of dimensionality affects this model and why feature scaling helps get better outcomes.
In this lesson, we learn how to run the Jupyter notebooks from the course GitHub repositories on SageMaker.
We understand the working principles of decision trees in this video. We fit a simple decision tree on a sample dataset. We then introduce the impurity concept with the Gini impurity score and explain the split condition selection process. After the decision tree discussion, we learn how to build ensemble models using the Bagging approach.
In this section, we learn two highly popular machine learning models: Linear and Logistic regression. We understand the equations and cost functions of each model.
In this video, we learn the basics of optimization, a topic that is key to training many machine learning models. We start with the gradient concept and then introduce the gradient descent optimization technique.
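A minimal NumPy sketch of gradient descent on a linear regression's mean squared error (toy, noise-free data; all names are illustrative):

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_steps=500):
    """Fit linear regression weights by gradient descent on squared error."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        # gradient of mean squared error with respect to w
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad          # step against the gradient
    return w

# y = 2*x1 + 3*x2 with no noise, so we should recover w close to [2, 3]
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, 3.0])
print(gradient_descent(X, y))  # close to [2. 3.]
```

Each step moves the weights in the direction that locally decreases the loss the fastest; the learning rate `lr` controls the step size.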
In this lesson, we learn another key concept in machine learning: Regularization. With regularization, we are able to get ML models that can generalize well on test sets. We learn L1, L2, and ElasticNet regularization and also see the Sklearn interfaces for them.
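To see the shrinkage effect of L2 regularization concretely, here is a sketch of ridge regression via its closed form (NumPy, toy data; in scikit-learn the corresponding estimators are `Ridge`, `Lasso`, and `ElasticNet`, all with the same `fit`/`predict` interface):

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: solve (X^T X + alpha*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, 3.0])
print(ridge_fit(X, y, alpha=0.0))   # alpha=0 recovers ordinary least squares: [2. 3.]
print(ridge_fit(X, y, alpha=10.0))  # strong penalty shrinks the weights toward zero
```

Larger `alpha` trades training fit for smaller weights, which is the mechanism by which regularization curbs overfitting; L1 differs in that it can drive weights exactly to zero.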
We learn how to find a good combination of hyperparameters using different approaches in this video. We cover grid search, randomized search, and Bayesian search.
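Grid search, the simplest of the three, just evaluates every combination. A sketch in pure Python with a toy scoring function standing in for a real validation-set metric (all names are illustrative):

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Try every combination of hyperparameter values; keep the best score."""
    best_params, best_score = None, float("-inf")
    keys = list(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = score_fn(params)  # in practice: train, then score on validation set
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# toy "validation score" with a known optimum at k=5, p=2
score = lambda p: -abs(p["k"] - 5) - abs(p["p"] - 2)
grid = {"k": [1, 3, 5, 7], "p": [1, 2]}
print(grid_search(grid, score))  # ({'k': 5, 'p': 2}, 0)
```

Randomized search samples combinations instead of enumerating them, and Bayesian search uses previous scores to decide which combination to try next.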
In this video, we cover one of the most important modern machine learning models: Neural Networks. We introduce neural networks using their similarity to the regression models that we learned. We learn the layered structure of neural networks, how data is passed with forward propagation, and how training works. Then, we see more details on the training process with cost functions and the gradient descent method.
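Forward propagation through a tiny two-layer network can be sketched in NumPy. The weights here are fixed by hand purely to show the shapes and the flow of data; in training they would be learned by gradient descent on a cost function:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, W1, b1, W2, b2):
    """Two-layer network: linear -> ReLU -> linear -> sigmoid."""
    h = relu(W1 @ x + b1)              # hidden layer activations
    out = W2 @ h + b2                  # output layer logit
    return 1.0 / (1.0 + np.exp(-out))  # sigmoid for a binary output

# 2 inputs -> 3 hidden units -> 1 output, with hand-picked toy weights
W1 = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
b1 = np.zeros(3)
W2 = np.array([[1.0, 1.0, 1.0]])
b2 = np.zeros(1)

p = forward(np.array([1.0, 0.0]), W1, b1, W2, b2)
print(p)  # a probability between 0 and 1
```

The similarity to logistic regression is visible in the last two lines of `forward`; the hidden layer in between is what lets the network learn non-linear decision boundaries.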
In this lesson, we learn how to build embedding vectors for the words in our corpus; we also call these word vectors. With these vectors, we are able to capture the semantic similarity between words. We use the dot product to measure similarity between word vectors. We then see the details of the Skip-gram and Continuous Bag of Words (CBOW) methods and build a probabilistic model that gives word vectors.
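The dot-product similarity idea can be sketched with hand-made toy vectors; real embeddings would come from Skip-gram or CBOW training, and the values below are invented purely for illustration:

```python
import numpy as np

def cosine_similarity(u, v):
    """Dot product of the two vectors after normalizing their lengths."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# hand-made toy "embeddings" (not trained vectors)
vectors = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.8, 0.9, 0.2]),
    "car": np.array([0.1, 0.2, 0.9]),
}
print(cosine_similarity(vectors["cat"], vectors["dog"]))  # high: related words
print(cosine_similarity(vectors["cat"], vectors["car"]))  # low: unrelated words
```

Normalizing by the vector lengths makes the measure depend only on direction, which is why cosine similarity is the usual choice for comparing word vectors.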
In this video, we learn a new type of neural network: Recurrent Neural Network (RNN). RNNs are usually used to process sequential data. We learn the overall structure of RNNs and understand how they use internal hidden states to preserve and update sequential information.
In this lesson, we build on the idea of RNNs and introduce a new type of network. Gated Recurrent Units (GRUs) use internal gates to control hidden state updates. We introduce the update and reset gates. The reset gate decides how much to remove from the previous hidden state. We also calculate a candidate hidden state and use the update gate to decide how much of this candidate to include in the overall hidden state.
Long Short Term Memory (LSTM) Networks use a similar approach to Gated Recurrent Units (GRUs). In this video, we learn the internal pieces of LSTMs and understand how they work together to preserve and update sequential information.
In this lesson, we introduce Transformers. We explain the key, value, and query concepts using a linguistic approach.
In this lesson, we learn how to implement the query, key, and value concepts using a neural network approach. We call this method "Single-Headed Attention".
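A minimal NumPy sketch of single-headed scaled dot-product attention, with random toy projections (all names and shapes are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, Wq, Wk, Wv):
    """Single-headed scaled dot-product attention over a sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv         # project inputs to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[1])   # similarity of each query to each key
    weights = softmax(scores)                # each row sums to 1
    return weights @ V                       # weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # sequence of 4 tokens, dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one context-aware vector per token
```

Each output row is a mixture of all the value vectors, weighted by how well that token's query matches every key, which is how each token gathers context from the rest of the sequence.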
In this video, we introduce the Multi-Headed Attention concept and conclude the Transformers topic. We learn that multi-headed attention helps cover a broader range of word meanings. Then, we briefly go over the Transformer architecture and give example models.
In this video, we go over the course overview, learning outcomes, and machine learning resources that we will use in this class.
In this video, we introduce Machine Learning (ML). We learn the machine learning lifecycle, which gives us a high-level view of some important ML processes, and cover some useful ML terminology.
In this video, we give some commonly seen ML application examples. We go over ranking, recommendation, classification, regression, clustering, and anomaly detection applications.
In this video, we learn about supervised and unsupervised learning. We then dive into more details of supervised learning with regression and classification problems, and see a clustering example for unsupervised learning.
In this video, we learn some data processing methods. We first start with the data imbalance problem and go over some methods to solve it. We then dive into the details of image augmentation methods that produce altered images. At the end, we learn how to split our dataset into training, validation, and test subsets, which is a critical process in ML.
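The splitting step can be sketched as a single shuffle followed by slicing (NumPy, illustrative 80/10/10 fractions and function name):

```python
import numpy as np

def train_val_test_split(n, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle indices once, then carve out validation and test slices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)               # one random shuffle of all indices
    n_val, n_test = int(n * val_frac), int(n * test_frac)
    val = idx[:n_val]
    test = idx[n_val:n_val + n_test]
    train = idx[n_val + n_test:]           # everything else goes to training
    return train, val, test

train, val, test = train_val_test_split(100)
print(len(train), len(val), len(test))  # 80 10 10
```

Shuffling before slicing matters: if the records are ordered (by class, date, etc.), contiguous slices would give the three subsets systematically different distributions.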