Beginners Machine Learning Theory in 1 Hour

Learn machine learning and artificial intelligence from scratch.

Learn about what Machine Learning is and what it can do

Who Should take this Course?

- Those who have little to no knowledge of machine learning

- Those who want to know more about machine learning

- Ideally, those with some knowledge of programming principles

What will we Learn?

- What is machine learning?

- What can we do with machine learning?

- What types of machine learning are there?

- This course is entirely theoretical and will be explored only in the slides

Why Learn ML and AI?

- Machine learning allows us to build more powerful, more accurate, and more user-friendly software

- Machine learning programs can respond and adapt and are often more effective than manual solutions

- Many companies are integrating machine learning or have already done so

- There are many high-paying machine learning jobs out there

- Fun and exciting topic

Quick Intro to Machine Learning

Get a quick overview of what machine learning is

What will we Cover?

- What is machine learning?

- Machine learning vs AI

- Where is machine learning used?

- How does machine learning work?

What is Machine Learning?

- Machine learning describes software that uses previously seen data to find patterns and perform a task it was not explicitly programmed to do

- Generally, the software starts out performing poorly and then improves over time as it runs

- The improvements are the result of an algorithm changing values by itself rather than a programmer manually tweaking the algorithm

Machine Learning vs AI

- Machine learning is only about using data and past experiences to perform a specific task

- AI is machines using deduction, logic, and reasoning to perform a task or tasks. Powerful AIs can make intelligent decisions based on human-like “thinking”

- Machine learning is a subset of AI and almost all sophisticated AIs employ some form of machine learning

Where we use Machine Learning

- Image recognition

- Self driving cars

- Speech detection

- Prediction (weather, stocks, text)

- Text summarization

- Language translation

- Classification/categorization

- Adaptive gaming

- AI

How Machine Learning Works

1. Figure out what problem we’re trying to solve and how we can convert the inputs and outputs into numbers

2. Gather and format data for training and testing

3. Build a model that can take an input, run a series of operations, and produce an output in the correct format

4. Train the model

5. Test and evaluate the model

6. Tweak the model if need be

Deep Diving into Machine Learning

Thoroughly explore what machine learning is and how it works

What will we Cover?

- What is machine learning?

- Problems machine learning can solve

- Types of machine learning

- How machine learning works

- Common machine learning structures

- Steps to building a machine learning program

What is Machine Learning?

- Technically, machine learning is the study of how machines use statistics and algorithms to infer patterns in data to perform some task implicitly

- Practically, machine learning software uses pattern recognition to analyze data and come to conclusions without us programming in how to arrive at those conclusions

- The key takeaway is that the software is not given explicit instructions about how to perform the task properly

Why the Learning Aspect?

- Generally, a machine learning model starts out performing relatively poorly but improves over time, hence the name machine learning

- If a model produces better results moving forwards than in the past, it is said to have “learned”

- Improvements could be more accurate predictions, better suggestions, more capable AI, etc., generally performing its task better

How does Learning Work?

- Theoretically, the more data a model is exposed to and the more it runs, the better the results

- The improvements come from the model making changes to internal values to more accurately map inputs to outputs, not from tweaking the algorithm itself

- By exposing a model to varied data and different kinds of inputs, it learns how to respond to many scenarios

- This is done during the training phase

Problems Solved with Machine Learning

Learn how machine learning solves real world problems

What can Machine Learning do?

- Machine learning draws conclusions through pattern recognition

- With training, models learn to produce specific outputs based on specific inputs or environments

- These problems can all be solved via pattern recognition and, thus, machine learning:

- Prediction

- Classification

- Customization

- Translation

- Game or other AI


Prediction

- Prediction is guessing some future outcome, event, or value

- When forming predictions, we often operate under the assumption that history repeats itself

- We examine past data to try to determine patterns that match the current data and assume the result will be similar to what happened in the past

Prediction Models

- Prediction models are trained on varied data to get exposure to many inputs and outputs

- Models learn to map various inputs to outputs so that, theoretically, if they see similar inputs in the future, they know roughly what to output

- We choose which features have an effect on the predicted outcome and quantify them to feed into the model
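
The idea of mapping past inputs to outputs can be sketched with a straight-line fit. This is only a minimal illustration: the data points, the single feature, and the names fit_line and predict are made up for the example and are not part of the course material.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to past data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Map a new input to an output using the fitted line."""
    return slope * x + intercept


# Hypothetical history: one quantified feature and the outcome observed with it.
past_inputs = [1.0, 2.0, 3.0, 4.0]
past_outputs = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(past_inputs, past_outputs)
forecast = predict(5.0, slope, intercept)  # guess for an input not seen before
```

The forecast is only as good as the assumption that the future follows the same pattern as the past data.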

Prediction Examples

- Stock market prediction - if we see repeated patterns of price movement, we can estimate when to buy and sell stocks to try to maximize profits

- Weather prediction - if we know from past data that certain factors produce a specific weather pattern, we can expect similar weather when we see those factors again

Classification

- Classification is the grouping of inputs into classes based on presence/absence of predetermined features

- We first decide into which classes we want to divide the inputs and then determine which features are important for determining membership

- Sometimes it can be difficult to quantify features or to determine which features matter and which don’t. It’s easy to overlook seemingly extraneous details

Classification Models

- Classification models are typically fed equal numbers of inputs from each category and learn to pick out and assign importance to the features

- Models then use the combination of features to determine probabilities of each input belonging to each category. It’s almost never 100% certain of membership

- Models don’t perform well on inputs that don’t fit into one of the predetermined classes as they try to fit every input into one of these categories

Classification Examples

- Image recognition/classification - if we’ve seen many images of items belonging to one of n categories, we can classify any new image into one of these categories based on presence/absence of physical characteristics (for us) or clusters of pixel values (for machines)

- Plant classification - we can use features such as petal shape/size, colouration, leaf shape/size, etc. to determine which species a plant belongs to

Customization

- Customization is providing an experience tailored towards a user based on their preferences

- We use previous decisions and habits to make suggestions to the user that we think best match their needs and wants

- Not so different from prediction in that we are studying past data to determine how to best service users in the present and future

Customization Models

- Customization models must take in a combination of past data and current user preferences as well as data from users in similar demographics (age, race, gender, etc)

- Models are only really good at performing one or two tasks so we often combine various customization models to provide a complete experience

- Models often combine aspects of prediction and classification

Customization Examples

- Targeted Ads - we can determine what products a user might be interested in buying based on their previous purchases, demographic membership, or even the conversations they are having

- Text prediction - we can automatically formulate responses to emails, texts, etc. based on how the user and people in similar demographics might respond


Translation

- Translation at its base is taking input in one form and producing the equivalent output in another

- Typically, this means changing input from one language to another without loss of meaning

- This is simple when there is a one-to-one translation or with very similar languages like Spanish to Italian

- This is very difficult when the structure is very different such as with English to Mandarin

Translation Models

- The tricky part is translating from one form to another with no loss of meaning

- Translation models, therefore, typically use an intermediate vector representation of input and an encoder-decoder structure

- An encoder extracts the meaning of the input and converts that to the intermediate vector

- The decoder then converts the intermediate vector to the output format without loss of meaning

Translation Examples

- Language Translation - language translation is rarely a 1:1 translation; when converting a sentence from one language to another, we need to break it down into its parts and meaning and then reformulate the sentence in the structure of the target language

- Speech-to-text - similar to language translation but instead of converting between languages, we are converting data from one format to another

Game AI

- A game AI makes in-game decisions and responds based on deduction and reasoning

- Good AIs follow a set of priority-labelled instructions based on current state rather than acting randomly

- Sophisticated AIs will respond, adapt, and change strategy in real time based on what they have been exposed to in the past, determining patterns of behaviour, and the current environment

Game AI Models

- Game AI models are fed an environment, state, series of actions, and a reward system with the goal being getting the highest possible reward or score

- Typically, game AIs are not shown how to win a game or even how to determine a winning strategy

- Instead, models are shown how actions increase or decrease reward and during training, through trial and error, they determine what steps to take to maximize reward

Game AI Examples

- Chess AIs - chess models play many games until they learn, given each state, what the best moves would be. Good moves are those that get them closer to winning

- Runner AIs - runner AIs (e.g. Super Mario) play many games until they learn when to optimally jump, spin, move forwards, backwards, etc. Good moves are those that help them avoid obstacles and get to the end the fastest without dying

Types of Machine Learning

Learn about the different kinds of machine learning out there

Types of Machine Learning

- 3 main categories:

- Supervised

- Unsupervised

- Reinforcement

- All three types use some form of pattern recognition to solve a problem

- Each category contains different kinds of algorithms

- Supervised and reinforcement learning employ different kinds of training, while unsupervised learning does not use labelled training data

Supervised Learning

- Describes models trained by feeding in inputs and the correct outputs (labels) and allowing the model to adjust values in order to map the correct pathways

- Clear end goal with correct and incorrect answers for inputs such as categorizing them or outputting a final number

- Essentially, we show the model what the correct answer is for each input and the model tries to match it so that in the future, it can see similar inputs and know what to do

Supervised Learning Usage

- Most common of the three types of machine learning and the most beginner-friendly

- Most classification and regression type problems fall under supervised learning

- Specific algorithms:

- Regression (linear, non-linear, logistic)

- Decision tree

- K-nearest-neighbours
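
As one hedged illustration of supervised learning, here is a minimal k-nearest-neighbours sketch: a new input gets the label that is most common among the labelled examples closest to it. The petal measurements, species labels, and the name knn_classify are made up for the example.

```python
import math
from collections import Counter


def knn_classify(point, labelled_points, k=3):
    """Label `point` by majority vote among its k nearest labelled neighbours."""
    by_distance = sorted(
        labelled_points, key=lambda item: math.dist(point, item[0])
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labelled data: (petal length, petal width) -> species.
training = [
    ((1.4, 0.2), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.7, 1.4), "versicolor"),
    ((4.5, 1.5), "versicolor"),
    ((4.9, 1.5), "versicolor"),
]

label = knn_classify((1.6, 0.25), training)
```

Note that this algorithm has no weights to adjust; the labelled examples themselves act as the model.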

Unsupervised Learning

- Describes models that run with little to no human guidance during running; we just input values and see what happens

- Technically, no labelled training is required as there is no clear end goal or correct or incorrect answers

- Models are used to cluster or group data by detecting potentially unknown patterns

Unsupervised Learning Usage

- Least common of the three types of machine learning

- Typically used in clustering algorithms or data compression as there is no clear correct output

- Specific algorithms:

- Apriori

- K-means
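
To make clustering concrete, here is a minimal K-means sketch on one-dimensional data. It is a simplification under stated assumptions: the points are invented, the starting centroids are passed in by hand rather than chosen randomly, and the name k_means_1d is hypothetical.

```python
def k_means_1d(points, centroids, iterations=10):
    """Group 1-D points by repeatedly assigning each to its nearest centroid."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(p - centroids[i])
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[1.0, 12.0])
```

No labels are supplied anywhere; the two groups emerge only from how close the values are to each other.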

Reinforcement Learning

- Describes algorithms that improve based on a reward system during training

- There isn’t technically a right or wrong answer to show the model; instead, we show the model what actions will result in a positive reward and vice versa

- During training, the model learns through trial and error how to achieve the maximum possible reward based on environment, current state, and possible actions

Reinforcement Learning Usage

- Second most common of the three types of machine learning and perhaps the most complex

- Often see this in game AIs and any model that employs a behaviour-reward-punishment system

- Specific algorithms:

- Markov decision process

How Machine Learning Works

Learn how to structure models and how the machine learning process works

What will we Cover?

- Definitions and background

- Common machine learning structures

- Steps to building a machine learning model

Neural Network and Neurons

- Neural network - computational graph, network of interconnected nodes/neurons through which data flows and is modified. Data enters via input nodes, travels through some number of intermediate or hidden nodes and exits via output nodes

- Neuron - node, single point in a neural network, that receives input and produces output. Organized into layers. Intermediate nodes typically have weight, bias, and activation function and modify the input

Weight, Bias, and Weighted Sum

- Weights - factor by which we multiply inputs of a neuron. Each input into a neuron has a potentially different weight. The model changes these values during training

- Weighted sum - sum of products of weights and inputs. Multiply each input by its corresponding weight and sum the results

- Bias - value added to the weighted sum, used to shift the neuron’s output
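
The three definitions above can be sketched in a few lines. The input, weight, and bias values are arbitrary numbers chosen for illustration, and the function names are hypothetical.

```python
def weighted_sum(inputs, weights):
    """Multiply each input by its corresponding weight and sum the results."""
    return sum(x * w for x, w in zip(inputs, weights))


def neuron_pre_activation(inputs, weights, bias):
    """Weighted sum plus bias: the value later passed to an activation function."""
    return weighted_sum(inputs, weights) + bias


inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]   # one weight per input (illustrative values)
bias = 0.5

z = neuron_pre_activation(inputs, weights, bias)  # 0.5 - 2.0 + 6.0 + 0.5
```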

Activation Function

- Activation function - function applied to the weighted sum + bias to transform a neuron’s output. Determines whether or not a neuron will fire

- If a neuron “fires”, it passes on its output to the next neuron

- Some neurons do not produce a sufficiently strong output to have an effect on the final model output

Activation Functions

- Sigmoid - Inputs produce outputs between 0 and 1

- TanH - Inputs produce outputs between -1 and 1

- ReLU - Negative inputs produce 0 and positive inputs pass through unchanged; used most often

- Softmax - Typically used in the final layer of classification models as it creates a probability distribution
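
For reference, the four activation functions listed above can be written as plain functions. This is a minimal sketch using only the standard library; subtracting the maximum inside softmax is a common numerical-stability choice, not something the slides specify.

```python
import math


def sigmoid(z):
    """Squash any input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash any input into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Return 0 for negative inputs and the input itself otherwise."""
    return max(0.0, z)


def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(zs)
    exps = [math.exp(z - largest) for z in zs]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])
```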

Common Machine Learning Structures

Learn some of the ways we build machine learning models

Common Structures

- Feed forward neural network

- Radial basis function neural network

- Recurrent neural network

- Modular neural network

- Sequence to sequence models

Single Layer Feed Forward

- Often called a Perceptron

- Input layer, one intermediary layer (with weights, biases, and activation function), and output layer

- Supervised learning

- Typically trained via the delta-rule algorithm, a form of gradient descent. After each input batch, calculate difference between expected and actual outputs and adjust weights to minimize this difference

- Examples: image recognition, classification
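
As a hedged sketch of delta-rule training, the snippet below fits a single linear neuron to the logical AND function. The dataset, learning rate, epoch count, and the 0.5 decision threshold are all assumptions made for illustration; the whole dataset is treated as one batch.

```python
def train_delta_rule(samples, learning_rate=0.5, epochs=500):
    """Batch delta rule for one linear neuron: shrink expected-minus-actual error."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        weight_updates = [0.0] * n_inputs
        bias_update = 0.0
        for inputs, expected in samples:
            actual = sum(w * x for w, x in zip(weights, inputs)) + bias
            error = expected - actual
            for i, x in enumerate(inputs):
                weight_updates[i] += error * x
            bias_update += error
        # Apply the averaged correction once per batch.
        weights = [
            w + learning_rate * u / len(samples)
            for w, u in zip(weights, weight_updates)
        ]
        bias += learning_rate * bias_update / len(samples)
    return weights, bias


def classify(inputs, weights, bias, threshold=0.5):
    """Report 1 when the neuron's output reaches the (assumed) threshold."""
    output = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if output >= threshold else 0


and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_delta_rule(and_gate)
predictions = [classify(x, weights, bias) for x, _ in and_gate]
```

A single neuron like this can only separate classes with a straight line, which is one reason multi-layer networks exist.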

Multi Layer Feed Forward

- Same as single layer but multiple intermediary layers

- Every node in a layer is connected to every node in the previous and next layers

- Supervised learning

- Typically trained via back-propagation. After each input batch, calculate difference between expected and actual outputs and adjust weights to minimize this difference

- Examples: image recognition, classification

Radial Basis Function (RBF)

- Similar to perceptron but intermediary nodes have radial basis activation functions

- Radial basis functions are typically Gaussian (a neuron fires maximally when the distance between its input and its weights, which act as a centre, is smallest)

- Excellent at detecting anomalies, so good at finding outliers in classification problems

- Supervised learning
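
A Gaussian radial basis activation can be sketched as follows. The centre, the width parameter, and the name gaussian_rbf are illustrative assumptions; real RBF networks learn the centres and widths.

```python
import math


def gaussian_rbf(inputs, centre, width=1.0):
    """Output 1 when `inputs` equals `centre`, decaying towards 0 with distance."""
    squared_distance = sum((x - c) ** 2 for x, c in zip(inputs, centre))
    return math.exp(-squared_distance / (2.0 * width ** 2))


centre = (1.0, 1.0)                      # hypothetical stored centre
near = gaussian_rbf((1.0, 1.0), centre)  # input sits exactly on the centre
far = gaussian_rbf((4.0, 5.0), centre)   # input is 5 units away
```

An input that produces a weak response from every neuron is far from everything the network has seen, which is why this structure suits outlier detection.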

Convolutional Neural Network

- Similar to single or multilayer feed forward but at least one intermediary layer has convolution functions

- Convolution functions simplify a complex problem by breaking it into smaller units

- Supervised learning

- Excellent at image processing

- Examples: image recognition
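
To show what "breaking it into smaller units" can look like, here is a minimal sketch that slides a small kernel over an image and sums the products at each position. The tiny image and the edge-detecting kernel are made up; like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), summing products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds where brightness increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
```

The feature map is non-zero only in the column where the dark and bright regions meet.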

Recurrent Neural Network

- Cyclical rather than feed forward; intermediary layers feed output back into themselves

- Intermediary layers usually contain LSTM cells

- An LSTM cell is a type of node that retains memory or state and uses that to alter the output

- Can “unroll” them to act like a multilayer feed-forward network

- Supervised or reinforcement learning

- Excellent at solving text or speech related problems
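
The feedback loop can be sketched with a single recurrent node that carries a hidden state from one step to the next. This is a plain recurrent cell, not an LSTM, and the weights are arbitrary illustrative values rather than trained ones.

```python
import math


def rnn_step(x, hidden, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Combine the current input with the previous state to get the new state."""
    return math.tanh(w_input * x + w_hidden * hidden + bias)


def run_sequence(xs):
    """Feed a sequence through the cell, passing the state forward each step."""
    hidden = 0.0
    states = []
    for x in xs:
        hidden = rnn_step(x, hidden)
        states.append(hidden)
    return states


# The same input three times gives three different outputs,
# because each step also sees what came before it.
states = run_sequence([1.0, 1.0, 1.0])
```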

Modular Neural Network

- Neural network with two or more independent modules managed by an intermediary module

- Modules act completely independently and do not communicate

- Intermediary connects modules and organizes flow of data between them

- Can be very efficient as each module specializes in its own task and

Sequence to Sequence Model

- Network consists of encoder, intermediary, and decoder

- Encoder converts inputs into an intermediary vector, often a map

- Decoder converts intermediary to output format, typically using an RNN

- Excellent when working with inputs and outputs that are of different lengths or formats (translation)

- Supervised or reinforcement learning

- Examples: image captioning, translation

Building a Machine Learning Program

Explore each of the steps taken to build a machine learning program from scratch


1. Figure out whether we can use machine learning to solve our problem and if so, what model type is best

2. Gather and format data

3. Build the computational graph

4. Train the model

5. Test and evaluate

6. Refactor and improve

Which Model to Use?

- We’ve seen that different types of models excel at solving particular kinds of problems efficiently

- Knowing which algorithm to use is not always obvious and there are often multiple valid solutions

- Sometimes it’s best to try a few different structures and choose the one that performs best

- A good place to start is to research how similar problems are solved and follow a similar structure

Gathering Data

- Generally we want as much data as possible; the more examples of inputs and outputs the model sees, the better the model performance

- We want varied but relevant data; there isn’t much to gain from passing in data that has nothing to do with the task or passing in the same data point many times

- There are many datasets online for solving a variety of problems; many of these are free

Formatting Data

- In most models, data should be formatted the same

- Take out the unnecessary parts as they could end up confounding the model

- Assign labels to each data point so each input has the “correct answer” label for the model to learn from

- Divide data into training and testing datasets; typically we do about an 80-20 split (80% training, 20% testing)

- Training and testing sets should be mutually exclusive to avoid a false sense of confidence
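
An 80-20 split can be sketched as below. The labelled data is invented, the function name is hypothetical, and the fixed seed is only there to make the shuffle repeatable.

```python
import random


def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and cut it into disjoint train and test sets."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    test_size = int(round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical labelled data points: (input, label).
labelled = [(x, x % 2) for x in range(10)]

train_set, test_set = train_test_split(labelled)
```

Because both sets are slices of one shuffled list, no data point can appear in both.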

Building the Model Itself

- This is the computational graph that will take inputs, transform them, and produce outputs

- First, we add an input layer. We usually map each part of the input to a separate node in the layer

- The number and structure of the intermediate or hidden layers depends entirely on the type of model

- Finish with the output layer which contains one node for each point of output
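
As a rough sketch of such a computational graph, the snippet below wires two inputs through one hidden layer to a single output node. The layer sizes and every weight and bias are hand-picked assumptions; a real model would learn these values during training.

```python
def dense_layer(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one node."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def relu(z):
    return max(0.0, z)


def identity(z):
    return z


# Hypothetical network: 2 input nodes -> 2 hidden nodes -> 1 output node.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, -0.5]
output_weights = [[1.0, 2.0]]
output_biases = [0.1]


def forward(inputs):
    """Pass inputs through the hidden layer, then the output layer."""
    hidden = dense_layer(inputs, hidden_weights, hidden_biases, relu)
    return dense_layer(hidden, output_weights, output_biases, identity)


output = forward([2.0, 1.0])
```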

Training a Model - Supervised

- This is where the model “learns” how to map inputs to outputs by adjusting internal values within nodes

- We feed inputs and corresponding labels into the model

- The model, based on its current weights and biases, produces some output for each input

- We measure the difference between model output (actual output) and the label (expected output)

- The sum of all of the differences is called the error or loss

Training a Model - Supervised

- The goal here is to minimize the total loss as a smaller loss indicates that the actual outputs are getting closer to the expected outputs

- The model minimizes loss by adjusting the weights and biases for each node using various algorithms

- This ensures that the correct neurons are firing, leading inputs down the correct paths to the correct outputs

- We usually run through the datasets multiple times to solidify the new values. Each time is called an epoch
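
The training loop described above can be sketched for the simplest possible model, one weight and one bias. This assumes a squared-difference loss and plain gradient descent, which is only one of the "various algorithms" a model might use; the data, learning rate, and epoch count are made up.

```python
def total_loss(weight, bias, samples):
    """Sum of squared differences between actual and expected outputs."""
    return sum((weight * x + bias - y) ** 2 for x, y in samples)


def train(samples, learning_rate=0.02, epochs=2000):
    """Adjust weight and bias a little after every sample, for many epochs."""
    weight, bias = 0.0, 0.0
    history = []
    for _ in range(epochs):  # one full pass over the data is one epoch
        for x, expected in samples:
            error = (weight * x + bias) - expected
            weight -= learning_rate * 2 * error * x  # gradient of error squared
            bias -= learning_rate * 2 * error
        history.append(total_loss(weight, bias, samples))
    return weight, bias, history


# Hypothetical labelled data following expected = 2 * input + 1.
samples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

weight, bias, history = train(samples)
```

Comparing history[0] with history[-1] shows the loss shrinking as the values settle.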

Training a Model - Reinforcement

- This is where the model learns how to perform as well as it can, typically by getting as far as it can

- We feed in environment, state, and reward function

- The model, based on its current state and environment, produces some output with each run

- How well the model performs is the total reward

Training a Model - Reinforcement

- The goal is to achieve the maximum possible reward

- This could be getting as far as possible or reaching the end as fast as possible in the given environment

- The model figures out which behaviours yield the maximum rewards (typically through trial and error) and learns to execute those behaviours more often over time

- We typically run the model many times (epochs), each time maintaining the state of the model so that it retains any improvements it has made
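
One way to sketch reward-driven training is tabular Q-learning on a tiny corridor. Everything here is an assumption for illustration: the five-position environment, the reward of 1 at the right-hand end, the learning rate and discount, and the choice to explore purely at random during training.

```python
import random

N_STATES = 5         # corridor positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)   # step left or step right


def step(state, action):
    """Hypothetical environment: reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train_q_table(episodes=500, learning_rate=0.5, discount=0.9, seed=0):
    """Learn the value of each action in each state through trial and error."""
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):  # the table persists between runs
        state = 0
        while state != N_STATES - 1:
            action_index = rng.randrange(len(ACTIONS))  # try a random action
            next_state, reward = step(state, ACTIONS[action_index])
            target = reward + discount * max(q_table[next_state])
            current = q_table[state][action_index]
            q_table[state][action_index] = current + learning_rate * (target - current)
            state = next_state
    return q_table


q_table = train_q_table()
# Best known action per non-goal state: index 0 is left, index 1 is right.
policy = [row.index(max(row)) for row in q_table[:-1]]
```

The model is never told that moving right wins; it only sees which actions eventually led to reward.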

Testing the Model

- Once training is complete and we receive a sufficiently low loss, high enough accuracy, or high enough reward, we move onto testing

- We evaluate model performance by testing it on new data or by altering the environment

- The process is the same as training but we don’t change any values; we just run the data through and evaluate

- It’s important to test with never-before-seen data to make sure the model is going to hold up in the real world
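
Evaluation can be sketched as running held-out examples through the model and counting correct answers, without changing any values. The stand-in model and the test data below are invented for the example.

```python
def accuracy(model, test_set):
    """Fraction of held-out examples for which the model's output matches the label."""
    correct = sum(1 for inputs, expected in test_set if model(inputs) == expected)
    return correct / len(test_set)


# Hypothetical already-trained model: outputs 1 for numbers above 5.
def trained_model(x):
    return 1 if x > 5 else 0


# Held-out examples the model never saw during training: (input, label).
test_set = [(2, 0), (4, 0), (6, 1), (8, 1), (5, 1)]

score = accuracy(trained_model, test_set)  # 4 of the 5 examples are correct
```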

Refactoring the Model

- After training, we output a total loss and accuracy

- If the loss is sufficiently low and the accuracy sufficiently high, we are done and can publish the model

- If we are unhappy with the performance, we can try:

- Gathering more data and retraining the model

- Changing the structure by modifying the number of nodes per layer, adding/removing layers, using different kinds of layers, etc.

- Using a different type of model

Your Instructor

Nimish Narang

Nimish Narang is Mammoth Interactive's lead developer specializing in Python, iOS and Android. Primarily a coder, he also is an avid trader.

Mammoth Interactive is a leading online course provider in everything from learning to code to becoming a YouTube star. Mammoth Interactive courses have been featured on Harvard’s edX, Business Insider and more.

Over 11 years, Mammoth Interactive has built a global student community with 1.1 million courses sold. Mammoth Interactive has released over 250 courses and 2,500 hours of video content.

Founder and CEO John Bura has been programming since 1997 and teaching since 2002. John has created top-selling applications for iOS, Xbox and more. John also runs SaaS company Devonian Apps, building efficiency-minded software for technology workers like you.

Frequently Asked Questions

When does the course start and finish?
The course starts now and never ends! It is a completely self-paced online course - you decide when you start and when you finish.
How long do I have access to the course?
How does lifetime access sound? After enrolling, you have unlimited access to this course for as long as you like - across any and all devices you own.
What if I am unhappy with the course?
We would never want you to be unhappy! If you are unsatisfied with your purchase, contact us in the first 30 days and we will give you a full refund.
