24th October 2017
Machine learning crash course
Questions to answer
1. What is meant by “machine learning” and “deep learning”?
2. Is deep learning with neural networks the best solution for
most problems nowadays, or what else is there to use?
3. How much theory do I need to get started with my app or
service?
4. How do I get from idea to trained and deployed model? …
and things to consider
1 - Machine learning and deep learning
Machine learning is a way to make advanced statistical models using
math. It’s a way to make a computer guess.
Machine learning models are fantastic with access to good data.
However, machine learning models can’t perform magic.
Garbage in = Garbage out
Types of machine learning

Supervised learning
Description: Modelling a specified target/output variable. Each example is “labelled”. Classification (into categories) or regression (on a numerical target).
Example uses: Predict if a person will default on their loan. Model the sale price of an apartment.

Unsupervised learning
Description: No target/control signal. Find structure in data. Clustering.
Example uses: Divide movies or bands into genres from user data.

Semi-supervised learning
Description: Mix of the above. Only part of the input data is labelled.
Example uses: As for supervised learning but with incomplete labeling.

Reinforcement learning
Description: A system explores its environment, takes actions and receives rewards. No explicit control signal. Learning by doing.
Example uses: Teach software to play a game (checkers, Atari Breakout, …)

Active learning
Description: Techniques to select the training examples that the algorithm could learn the most from at a given time.
Example uses: As for supervised learning but trying to optimize the learning process.
Deep learning is a subset of ML
“Deep learning” is just a
rebranding of “neural networks”!
These in turn are just systems for
fitting an output to an input by
repeatedly applying linear
transformations, each followed by a
non-linear function, to the inputs and
adjusting the weights until the results
match the outputs.
Basic idea of neural networks
The input data, your data points, are assigned to input
“nodes” in an input “layer”.
These are connected with weights to a “hidden” layer,
which in turn is connected to the next hidden layer, or the
output layer.
The values in the input layer are each multiplied by a weight
and the resulting products are summed (and usually passed
through a non-linear function) to give an
“activation value” in the hidden (or output) layer.
In the output layer, the activations are compared to the
desired activations, and the weights are adjusted based
on how big the mismatch is. This is the “learning” part.
Deep learning just means many layers like this plus
maybe more complicated patterns in how the weights are
connected to inputs.
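
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how real frameworks implement it: the layer sizes, the sigmoid activation, the squared-error mismatch, the learning rate, the single made-up training example and the names `forward` and `train_step` are all choices made for this example. Bias terms are left out to stay close to the description.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Weighted sums followed by a sigmoid, layer by layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient-descent update on the squared error of one example."""
    hidden, output = forward(x, w_hidden, w_out)
    # Mismatch at the output, scaled by the slope of the sigmoid.
    delta_out = (output - target) * output * (1.0 - output)
    # Pass the mismatch back to each hidden node (using the old output weights).
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Adjust the weights in proportion to the mismatch: the "learning" part.
    for j in range(len(w_out)):
        w_out[j] -= lr * delta_out * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * (output - target) ** 2

random.seed(0)
# Assumed shape: 2 inputs, 3 hidden nodes, 1 output.
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(3)]
x, target = [1.0, 0.5], 1.0   # one made-up labelled example

losses = [train_step(x, target, w_hidden, w_out) for _ in range(200)]
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

The printed loss should shrink as the weights are adjusted. A real network would be trained on many examples rather than one.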
Classical ML methods
• Decision trees
• Linear or logistic regression

Elaborations on classical ML methods
• Random forests (decision tree ensembles)
• Gaussian process regression (good at handling uncertainty)
• Lasso regression (looking for simple models)
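
As a small illustration of one classical method, the sketch below fits a one-variable linear regression by ordinary least squares in plain Python. The apartment sizes and prices are made-up numbers and `fit_line` is a hypothetical helper name. In practice a library such as scikit-learn would normally do this.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

sizes = [30.0, 45.0, 60.0, 80.0, 100.0]   # apartment size (made up)
prices = [1.6, 2.3, 3.1, 4.0, 5.1]        # sale price in arbitrary units (made up)

slope, intercept = fit_line(sizes, prices)
predicted = slope * 70.0 + intercept       # predicted price for a size of 70
print(f"slope={slope:.3f}, intercept={intercept:.3f}, predicted={predicted:.2f}")
```

The fitted slope and intercept can be read directly, which is one reason such models are easy to interpret.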
2 - What does deep learning do well (better)?
What deep learning does well
• Object recognition in images
• Neural machine translation (recurrent neural networks, sequence modelling)
• Self-learning for games: AlphaGo + AlphaGo Zero, pure learning approach (no heuristics)
Applications
• Image captioning
• Style transfer
Applications where you probably don’t need deep learning
• “Tabular data”, i.e., you have a table with rows and columns, where
the variables (columns) are a mix of numerical and categorical
variables – often standard methods are enough
• You have a small number of training examples (e.g. a couple of
hundred or fewer)
• When you want to create an easily interpretable model or just make
a quick sanity check
• Often, end users are more interested in understanding which
variables are important rather than the model’s accuracy
… in other words, you will quite often not need it.
3 - How much theory do I need?
How much math do I need?
Of course it is better to know some math/theory but frankly, it is probably sufficient to have some
intuition on how each method works.
If you know your maths, it is easier to implement models from papers or your own models, but existing
frameworks are enough to do a lot already. There has never been a better time to get into machine
learning!
- Incredible amounts of tutorials on e.g. GitHub and Medium
- MOOCs:
Andrew Ng, Coursera and deeplearning.ai
Jeremy Howard, Fast.ai
cognitiveclass.ai
- Software frameworks such as scikit-learn for Python
DSX tutorials & articles https://apsportal.ibm.com/community
ML crash course
A way to practice: online contests
• Kaggle: the largest online predictive modeling competition
platform
• Founded 2010. Acquired by Google 2017
• Companies or organizations define problems and
provide data; users compete for the best score. The
winner gets a cash prize or in some cases a job offer
• The leaderboard is motivating
• You can learn a lot from the discussion board
• Useful to learn and try out new techniques
• Learn not to overfit
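
One habit that competitions teach is to judge a model on data it has not seen. The sketch below is an illustration under assumptions: the data are synthetic with 20% label noise, the "model" is a 1-nearest-neighbour rule that simply memorises the training set, and the helper names are hypothetical. It compares accuracy on the training data with accuracy on a held-out part.

```python
import random

def nearest_neighbour_predict(train, x):
    """Return the label of the training point closest to x (1-NN)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(train, data):
    """Fraction of examples in `data` that the 1-NN rule labels correctly."""
    hits = sum(nearest_neighbour_predict(train, x) == y for x, y in data)
    return hits / len(data)

random.seed(1)
data = []
for _ in range(200):
    x = random.random()
    label = int(x > 0.5)            # the true rule
    if random.random() < 0.2:       # 20% of labels are flipped (noise)
        label = 1 - label
    data.append((x, label))

# Hold out the last 50 examples; the model never sees them during "training".
train, holdout = data[:150], data[150:]

train_acc = accuracy(train, train)      # memorised, so it looks perfect
holdout_acc = accuracy(train, holdout)  # the more honest estimate
print(f"training accuracy {train_acc:.2f}, held-out accuracy {holdout_acc:.2f}")
```

The gap between the two numbers is the overfitting. A competition leaderboard plays the role of the held-out set.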
Meetups
4 - How to go from idea to trained & deployed model?
Checklist for a machine learning idea

Understand the goal.
What do you want to be able to predict or understand?
Can it be measured in a good way?
Do you have the data necessary to model it?

Actionability.
What is the next step if you get a good predictive model? Can you use it?
Are the variables that you use such that they can be easily adjusted?
Will end users be able to act on the results?

Data quality.
Can you extract the data in a good way?
Are the data complete? Are there missing/suspicious values?

Training data size and shape.
Do you have enough examples for training compared to the number of
variables (dimensions)? Do you have “wide” or “long” data?
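
Two of the checklist questions (missing values, and wide vs. long data) can be checked mechanically before any modelling. The sketch below is a minimal illustration: the loan records are made up, `summarize` is a hypothetical helper, and treating "more columns than rows" as "wide" is an assumption made for the example.

```python
def summarize(rows):
    """rows: list of dicts, one per example; None marks a missing value."""
    columns = sorted({key for row in rows for key in row})
    # Count missing entries per column.
    missing = {c: sum(1 for r in rows if r.get(c) is None) for c in columns}
    n_rows, n_cols = len(rows), len(columns)
    # Assumed rule of thumb: more variables than examples counts as "wide".
    shape = "wide" if n_cols > n_rows else "long"
    return {"rows": n_rows, "columns": n_cols, "missing": missing, "shape": shape}

# Made-up loan records for illustration only.
rows = [
    {"age": 34, "income": 52000, "defaulted": 0},
    {"age": None, "income": 61000, "defaulted": 0},
    {"age": 45, "income": None, "defaulted": 1},
    {"age": 29, "income": 48000, "defaulted": None},
]

report = summarize(rows)
print(report)
```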
Process
Tools: open-source vs. proprietary
(Figure: open-source and proprietary tools, grouped by category.)
Categories: data science tools, project collaboration, notebooks, model deployment
Tools named: scikit-learn, TensorFlow, Keras, caret, mlbench, Shiny
Deploying machine learning models
Easiest way? – Watson ML (today’s demo), or equivalents on Azure (Microsoft), CloudML
(Google), ECS (Amazon) …
TensorFlow (like some other frameworks) has built-in serving capabilities (TensorFlow Serving).
Do-it-yourself web servers – often done using Flask (Python web framework), or for
language-independent model deployment, OpenScoring (uses PMML).
For non-production-grade deployment, you can use Shiny (R web app library), or Python
equivalents: (Plotly) Dash, Bokeh, (IBM) PixieDust.
Yhat - https://www.yhat.com/products/scienceops - commercial model deployment
solution that hooks directly into R or Python
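
To make the do-it-yourself option concrete, here is a minimal sketch of a prediction endpoint. The talk names Flask for this; to stay dependency-free the sketch swaps in Python's standard-library `http.server` instead. The "model" is a stand-in with made-up weights, and the JSON layout, port and names (`predict`, `PredictHandler`, `serve`) are assumptions. It has no input validation or error handling, so it is not production-grade.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for a trained model: fixed, made-up weights for a linear score.
WEIGHTS = [0.5, -0.25]
BIAS = 0.1

def predict(features):
    """Return the linear score for one example (a list of numbers)."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expect a JSON body such as {"features": [1.0, 2.0]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the server. A client can then POST a JSON body with a `features` list and receive the prediction as JSON.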
What was not covered in this talk
• Visualisation and exploratory data analysis (including dynamic
data exploration apps)
• Details on how different ML models work
• Case studies
Maybe next time? :)
30

More Related Content

What's hot

EDF2013: Big Data Tutorial: Marko Grobelnik
EDF2013: Big Data Tutorial: Marko GrobelnikEDF2013: Big Data Tutorial: Marko Grobelnik
EDF2013: Big Data Tutorial: Marko Grobelnik
European Data Forum
 
Open Data, Big Data and Machine Learning
Open Data, Big Data and Machine LearningOpen Data, Big Data and Machine Learning
Open Data, Big Data and Machine Learning
Steven Van Vaerenbergh
 
Primer to Machine Learning
Primer to Machine LearningPrimer to Machine Learning
Primer to Machine Learning
Jeff Tanner
 
Introduction to Apache Mahout
Introduction to Apache MahoutIntroduction to Apache Mahout
Introduction to Apache Mahout
Edureka!
 
Machine Learning for Non-technical People
Machine Learning for Non-technical PeopleMachine Learning for Non-technical People
Machine Learning for Non-technical People
indico data
 
From Big Data to AI
From Big Data to AIFrom Big Data to AI
From Big Data to AI
Maloy Manna, PMP®
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
CloudxLab
 
H2O & Tensorflow - Fabrizio
H2O & Tensorflow - Fabrizio H2O & Tensorflow - Fabrizio
H2O & Tensorflow - Fabrizio
Sri Ambati
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & Opportunities
CodePolitan
 
Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생
Han Woo PARK
 
Big data and AI presentation slides
Big data and AI presentation slidesBig data and AI presentation slides
Big data and AI presentation slides
CloudxLab
 
Materials for getting started with data science
Materials for getting started with data scienceMaterials for getting started with data science
Materials for getting started with data science
ihansel
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
Pruet Boonma
 
Norman Sasono - Incorporating AI/ML into Your Application Architecture
Norman Sasono - Incorporating AI/ML into Your Application ArchitectureNorman Sasono - Incorporating AI/ML into Your Application Architecture
Norman Sasono - Incorporating AI/ML into Your Application Architecture
Agile Impact
 
Putting the Magic in Data Science
Putting the Magic in Data SciencePutting the Magic in Data Science
Putting the Magic in Data Science
Sean Taylor
 
Human in the loop: a design pattern for managing teams working with ML
Human in the loop: a design pattern for managing  teams working with MLHuman in the loop: a design pattern for managing  teams working with ML
Human in the loop: a design pattern for managing teams working with ML
Paco Nathan
 
Introduction to Data Science and Large-scale Machine Learning
Introduction to Data Science and Large-scale Machine LearningIntroduction to Data Science and Large-scale Machine Learning
Introduction to Data Science and Large-scale Machine Learning
Nik Spirin
 
AI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The OverviewAI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The Overview
Spotle.ai
 
How to become a Data Scientist?
How to become a Data Scientist? How to become a Data Scientist?
How to become a Data Scientist?
HackerEarth
 
Mahout and Distributed Machine Learning 101
Mahout and Distributed Machine Learning 101Mahout and Distributed Machine Learning 101
Mahout and Distributed Machine Learning 101
John Ternent
 

What's hot (20)

EDF2013: Big Data Tutorial: Marko Grobelnik
EDF2013: Big Data Tutorial: Marko GrobelnikEDF2013: Big Data Tutorial: Marko Grobelnik
EDF2013: Big Data Tutorial: Marko Grobelnik
 
Open Data, Big Data and Machine Learning
Open Data, Big Data and Machine LearningOpen Data, Big Data and Machine Learning
Open Data, Big Data and Machine Learning
 
Primer to Machine Learning
Primer to Machine LearningPrimer to Machine Learning
Primer to Machine Learning
 
Introduction to Apache Mahout
Introduction to Apache MahoutIntroduction to Apache Mahout
Introduction to Apache Mahout
 
Machine Learning for Non-technical People
Machine Learning for Non-technical PeopleMachine Learning for Non-technical People
Machine Learning for Non-technical People
 
From Big Data to AI
From Big Data to AIFrom Big Data to AI
From Big Data to AI
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
H2O & Tensorflow - Fabrizio
H2O & Tensorflow - Fabrizio H2O & Tensorflow - Fabrizio
H2O & Tensorflow - Fabrizio
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & Opportunities
 
Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생Mapping (big) data science (15 dec2014)대학(원)생
Mapping (big) data science (15 dec2014)대학(원)생
 
Big data and AI presentation slides
Big data and AI presentation slidesBig data and AI presentation slides
Big data and AI presentation slides
 
Materials for getting started with data science
Materials for getting started with data scienceMaterials for getting started with data science
Materials for getting started with data science
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Norman Sasono - Incorporating AI/ML into Your Application Architecture
Norman Sasono - Incorporating AI/ML into Your Application ArchitectureNorman Sasono - Incorporating AI/ML into Your Application Architecture
Norman Sasono - Incorporating AI/ML into Your Application Architecture
 
Putting the Magic in Data Science
Putting the Magic in Data SciencePutting the Magic in Data Science
Putting the Magic in Data Science
 
Human in the loop: a design pattern for managing teams working with ML
Human in the loop: a design pattern for managing  teams working with MLHuman in the loop: a design pattern for managing  teams working with ML
Human in the loop: a design pattern for managing teams working with ML
 
Introduction to Data Science and Large-scale Machine Learning
Introduction to Data Science and Large-scale Machine LearningIntroduction to Data Science and Large-scale Machine Learning
Introduction to Data Science and Large-scale Machine Learning
 
AI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The OverviewAI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The Overview
 
How to become a Data Scientist?
How to become a Data Scientist? How to become a Data Scientist?
How to become a Data Scientist?
 
Mahout and Distributed Machine Learning 101
Mahout and Distributed Machine Learning 101Mahout and Distributed Machine Learning 101
Mahout and Distributed Machine Learning 101
 

Similar to ML crash course

An Introduction to Machine Learning
An Introduction to Machine LearningAn Introduction to Machine Learning
An Introduction to Machine Learning
Vedaj Padman
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
Johnson Ubah
 
Deep learning Introduction and Basics
Deep learning  Introduction and BasicsDeep learning  Introduction and Basics
Deep learning Introduction and Basics
Nitin Mishra
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
Charmi Chokshi
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
pradeepskvch
 
Mis End Term Exam Theory Concepts
Mis End Term Exam Theory ConceptsMis End Term Exam Theory Concepts
Mis End Term Exam Theory Concepts
Vidya sagar Sharma
 
Machine Learning - Deep Learning
Machine Learning - Deep LearningMachine Learning - Deep Learning
Machine Learning - Deep Learning
Oluwasegun Matthew
 
notes as .ppt
notes as .pptnotes as .ppt
notes as .ppt
butest
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
butest
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
Oluwasegun Matthew
 
Machine Learning Tutorial for Beginners
Machine Learning Tutorial for BeginnersMachine Learning Tutorial for Beginners
Machine Learning Tutorial for Beginners
grinu
 
How data science works and how can customers help
How data science works and how can customers helpHow data science works and how can customers help
How data science works and how can customers help
Danko Nikolic
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
HJ van Veen
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2
Roger Barga
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
Rahul Jaiman
 
Week 4 advanced labeling, augmentation and data preprocessing
Week 4   advanced labeling, augmentation and data preprocessingWeek 4   advanced labeling, augmentation and data preprocessing
Week 4 advanced labeling, augmentation and data preprocessing
Ajay Taneja
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
Aun Akbar
 
lec1.ppt
lec1.pptlec1.ppt
lec1.ppt
SVasuKrishna1
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
PranavPatil822557
 
Machine learning
Machine learningMachine learning
Machine learning
Abrar ali
 

Similar to ML crash course (20)

An Introduction to Machine Learning
An Introduction to Machine LearningAn Introduction to Machine Learning
An Introduction to Machine Learning
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
 
Deep learning Introduction and Basics
Deep learning  Introduction and BasicsDeep learning  Introduction and Basics
Deep learning Introduction and Basics
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Mis End Term Exam Theory Concepts
Mis End Term Exam Theory ConceptsMis End Term Exam Theory Concepts
Mis End Term Exam Theory Concepts
 
Machine Learning - Deep Learning
Machine Learning - Deep LearningMachine Learning - Deep Learning
Machine Learning - Deep Learning
 
notes as .ppt
notes as .pptnotes as .ppt
notes as .ppt
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Machine Learning Tutorial for Beginners
Machine Learning Tutorial for BeginnersMachine Learning Tutorial for Beginners
Machine Learning Tutorial for Beginners
 
How data science works and how can customers help
How data science works and how can customers helpHow data science works and how can customers help
How data science works and how can customers help
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
 
Week 4 advanced labeling, augmentation and data preprocessing
Week 4   advanced labeling, augmentation and data preprocessingWeek 4   advanced labeling, augmentation and data preprocessing
Week 4 advanced labeling, augmentation and data preprocessing
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
lec1.ppt
lec1.pptlec1.ppt
lec1.ppt
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
 
Machine learning
Machine learningMachine learning
Machine learning
 

More from mikaelhuss

Deep learning with Tensorflow in R
Deep learning with Tensorflow in RDeep learning with Tensorflow in R
Deep learning with Tensorflow in R
mikaelhuss
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
mikaelhuss
 
Comparing public RNA-seq data
Comparing public RNA-seq dataComparing public RNA-seq data
Comparing public RNA-seq data
mikaelhuss
 
RNA-seq differential expression analysis
RNA-seq differential expression analysisRNA-seq differential expression analysis
RNA-seq differential expression analysis
mikaelhuss
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processing
mikaelhuss
 
Emerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsEmerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomics
mikaelhuss
 
Data analytics challenges in genomics
Data analytics challenges in genomicsData analytics challenges in genomics
Data analytics challenges in genomics
mikaelhuss
 

More from mikaelhuss (7)

Deep learning with Tensorflow in R
Deep learning with Tensorflow in RDeep learning with Tensorflow in R
Deep learning with Tensorflow in R
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
 
Comparing public RNA-seq data
Comparing public RNA-seq dataComparing public RNA-seq data
Comparing public RNA-seq data
 
RNA-seq differential expression analysis
RNA-seq differential expression analysisRNA-seq differential expression analysis
RNA-seq differential expression analysis
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processing
 
Emerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsEmerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomics
 
Data analytics challenges in genomics
Data analytics challenges in genomicsData analytics challenges in genomics
Data analytics challenges in genomics
 

Recently uploaded

Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
weiwchu
 
Where to order Frederick Community College diploma?
Where to order Frederick Community College diploma?Where to order Frederick Community College diploma?
Where to order Frederick Community College diploma?
SomalyEng
 
Cyber Insurance Mathematical Model & Pricing
Cyber Insurance Mathematical Model & PricingCyber Insurance Mathematical Model & Pricing
Cyber Insurance Mathematical Model & Pricing
BaraDaniel1
 
The Rise of Python in Finance,Automating Trading Strategies: _.pdf
The Rise of Python in Finance,Automating Trading Strategies: _.pdfThe Rise of Python in Finance,Automating Trading Strategies: _.pdf
The Rise of Python in Finance,Automating Trading Strategies: _.pdf
Riya Sen
 
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Alireza Kamrani
 
Why_are_we_hypnotizing_ourselves-_ATeggin-1.pdf
Why_are_we_hypnotizing_ourselves-_ATeggin-1.pdfWhy_are_we_hypnotizing_ourselves-_ATeggin-1.pdf
Why_are_we_hypnotizing_ourselves-_ATeggin-1.pdf
Alexander Teggin
 
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408
Grant McAlister
 
Acid Base Practice Test 4- KEY.pdfkkjkjk
Acid Base Practice Test 4- KEY.pdfkkjkjkAcid Base Practice Test 4- KEY.pdfkkjkjk
Acid Base Practice Test 4- KEY.pdfkkjkjk
talha2khan2k
 
History and Application of LLM Leveraging Big Data
History and Application of LLM Leveraging Big DataHistory and Application of LLM Leveraging Big Data
History and Application of LLM Leveraging Big Data
Jongwook Woo
 
Practical Research for grade 12 students
Practical Research for grade 12 studentsPractical Research for grade 12 students
Practical Research for grade 12 students
juliaaaaana10
 
Python knowledge ,......................
Python knowledge ,......................Python knowledge ,......................
Python knowledge ,......................
sabith777a
 
Training on CSPro and step by steps.pptx
Training on CSPro and step by steps.pptxTraining on CSPro and step by steps.pptx
Training on CSPro and step by steps.pptx
lenjisoHussein
 
Field Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdfField Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdf
hritikbui
 
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion data
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion dataTowards an Analysis-Ready, Cloud-Optimised service for FAIR fusion data
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion data
Samuel Jackson
 
Histology of Muscle types histology o.ppt
Histology of Muscle types histology o.pptHistology of Muscle types histology o.ppt
Histology of Muscle types histology o.ppt
SamanArshad11
 
Cyber Insurance Mathematical Model & Pricing 2
Cyber Insurance Mathematical Model & Pricing 2Cyber Insurance Mathematical Model & Pricing 2
Cyber Insurance Mathematical Model & Pricing 2
BaraDaniel1
 
UNITEC Institute of Technology diploma
UNITEC Institute of Technology diplomaUNITEC Institute of Technology diploma
UNITEC Institute of Technology diploma
oyhka
 
Parcel Delivery - Intel Segmentation and Last Mile Opt.pptx
Parcel Delivery - Intel Segmentation and Last Mile Opt.pptxParcel Delivery - Intel Segmentation and Last Mile Opt.pptx
Parcel Delivery - Intel Segmentation and Last Mile Opt.pptx
AltanAtabarut
 
Aws MLOps Interview Questions with answers
Aws MLOps Interview Questions  with answersAws MLOps Interview Questions  with answers
Aws MLOps Interview Questions with answers
Sathiakumar Chandr
 
future-of-asset-management-future-of-asset-management
future-of-asset-management-future-of-asset-managementfuture-of-asset-management-future-of-asset-management
future-of-asset-management-future-of-asset-management
Aadee4
 

Recently uploaded (20)

Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
 
Where to order Frederick Community College diploma?
Where to order Frederick Community College diploma?Where to order Frederick Community College diploma?
Where to order Frederick Community College diploma?
 
Cyber Insurance Mathematical Model & Pricing
Cyber Insurance Mathematical Model & PricingCyber Insurance Mathematical Model & Pricing
Cyber Insurance Mathematical Model & Pricing
 
The Rise of Python in Finance,Automating Trading Strategies: _.pdf
The Rise of Python in Finance,Automating Trading Strategies: _.pdfThe Rise of Python in Finance,Automating Trading Strategies: _.pdf
The Rise of Python in Finance,Automating Trading Strategies: _.pdf
 
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
 
Why_are_we_hypnotizing_ourselves-_ATeggin-1.pdf
Why_are_we_hypnotizing_ourselves-_ATeggin-1.pdfWhy_are_we_hypnotizing_ourselves-_ATeggin-1.pdf
Why_are_we_hypnotizing_ourselves-_ATeggin-1.pdf
 
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408
 
Acid Base Practice Test 4- KEY.pdfkkjkjk
Acid Base Practice Test 4- KEY.pdfkkjkjkAcid Base Practice Test 4- KEY.pdfkkjkjk
Acid Base Practice Test 4- KEY.pdfkkjkjk
 
History and Application of LLM Leveraging Big Data
History and Application of LLM Leveraging Big DataHistory and Application of LLM Leveraging Big Data
History and Application of LLM Leveraging Big Data
 
Practical Research for grade 12 students
Practical Research for grade 12 studentsPractical Research for grade 12 students
Practical Research for grade 12 students
 
Python knowledge ,......................
Python knowledge ,......................Python knowledge ,......................
Python knowledge ,......................
 
Training on CSPro and step by steps.pptx
Training on CSPro and step by steps.pptxTraining on CSPro and step by steps.pptx
Training on CSPro and step by steps.pptx
 
Field Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdfField Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdf
 
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion data
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion dataTowards an Analysis-Ready, Cloud-Optimised service for FAIR fusion data
Towards an Analysis-Ready, Cloud-Optimised service for FAIR fusion data
 
Histology of Muscle types histology o.ppt
Histology of Muscle types histology o.pptHistology of Muscle types histology o.ppt
Histology of Muscle types histology o.ppt
 
Cyber Insurance Mathematical Model & Pricing 2
Cyber Insurance Mathematical Model & Pricing 2Cyber Insurance Mathematical Model & Pricing 2
Cyber Insurance Mathematical Model & Pricing 2
 
UNITEC Institute of Technology diploma
UNITEC Institute of Technology diplomaUNITEC Institute of Technology diploma
UNITEC Institute of Technology diploma
 
Parcel Delivery - Intel Segmentation and Last Mile Opt.pptx
Parcel Delivery - Intel Segmentation and Last Mile Opt.pptxParcel Delivery - Intel Segmentation and Last Mile Opt.pptx
Parcel Delivery - Intel Segmentation and Last Mile Opt.pptx
 
Aws MLOps Interview Questions with answers
Aws MLOps Interview Questions  with answersAws MLOps Interview Questions  with answers
Aws MLOps Interview Questions with answers
 
future-of-asset-management-future-of-asset-management
future-of-asset-management-future-of-asset-managementfuture-of-asset-management-future-of-asset-management
future-of-asset-management-future-of-asset-management
 

ML crash course

  • 1. 24st October 2017 Machine learning crash course
  • 2. Questions to answer 1. What is meant by “machine learning” and “deep learning”? 2. Is deep learning with neural networks the best solution for most problems now days or what else is there to use? 3. How much theory do I need to get started with my app or service? 4. How do I get from idea to trained and deployed model? … and things to consider
  • 3. 1 - Machine learning and deep learning
  • 4. Machine learning is a way to make advanced statistical models using math. It’s a way to make a computer guess. Machine learning models are fantastic with access to good data. However machine learning models can’t perform magic. Garbage in = Garbage out
  • 5. Type of learning Description Example uses Supervised learning Modelling an specified target/output variable. Each example is “labelled”. Classification (into categories) or regression (on a numerical target). Predict if a person will default on their loan. Model the sale price of an apartment. Unsupervised learning No target/control signal. Find structure in data. Clustering. Divide movies or bands into genres from user data. Semi-supervised learning Mix of the above. Only part of input data is labelled. As for supervised learning but with incomplete labeling. Reinforcement learning A system explores its environment, takes actions and receives rewards. No explicit control signal. Learning by doing. Teach software to play a game (checkers, Atari Breakout, …) Active learning Techniques to select the training examples that the algorithm could learn the most from at a given time. As for supervised learning but trying to optimize the learning process. Types of machine learning
  • 6. Deep learning is a subset of ML. “Deep learning” is just a rebranding of “neural networks”! These in turn are just systems for fitting an output to an input by repeatedly applying linear transformations (each followed by a simple nonlinear function) to the inputs, and tuning those transformations until the results match the outputs.
  • 7. Basic idea of neural networks: The input data, your data points, are assigned to input “nodes” in an input “layer”. These are connected with weights to a “hidden” layer, which in turn is connected to the next hidden layer, or to the output layer. Each value in the input layer is multiplied by its weight and the resulting products are summed to give an “activation value” in the hidden (or output) layer. In the output layer, the activations are compared to the desired activations, and the weights are adjusted based on how big the mismatch is. This is the “learning” part. Deep learning just means many layers like this, plus maybe more complicated patterns in how the weights are connected to inputs.
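The weighted-sum-and-adjust loop described on this slide can be sketched in a few lines of plain Python. This is only a minimal illustration, not the method used in the talk: it assumes a single output node with no hidden layer, and the toy data, learning rate and epoch count are all made up for the example.

```python
# Minimal sketch: one output node, no hidden layer.
# The data, learning rate, and epoch count are illustrative assumptions.

def activation(inputs, weights):
    """Weighted sum of the input values (the node's activation value)."""
    return sum(x * w for x, w in zip(inputs, weights))

def train(examples, n_inputs, learning_rate=0.1, epochs=200):
    """Adjust the weights in proportion to the output mismatch."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in examples:
            mismatch = target - activation(inputs, weights)
            # Each weight moves in the direction that shrinks the mismatch.
            weights = [w + learning_rate * mismatch * x
                       for w, x in zip(weights, inputs)]
    return weights

# Toy data generated from target = 2*x1 - 1*x2 (a made-up relationship).
examples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
weights = train(examples, n_inputs=2)
print([round(w, 3) for w in weights])
```

After training, the weights should end up close to the values used to generate the toy data. A real network adds hidden layers and nonlinear activations, and frameworks handle the weight updates for you.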
  • 8. Classical ML methods: decision trees; linear or logistic regression
  • 9. Elaborations on classical ML methods: random forests (decision tree ensembles); Gaussian process regression (good at handling uncertainty); Lasso regression (looking for simple models)
  • 10. 2 - What does deep learning do well (better)?
  • 11. What deep learning does well: object recognition in images
  • 12. What deep learning does well: neural machine translation (recurrent neural networks, sequence modelling)
  • 14. AlphaGo + AlphaGo Zero Pure learning approach (no heuristics)
  • 17. Applications where you probably don’t need deep learning
      - “Tabular data”, i.e., you have a table with rows and columns, where the variables (columns) are a mix of numerical and categorical variables. Often standard methods are enough.
      - You have a small number of training examples (e.g. a couple of hundred or fewer).
      - When you want to create an easily interpretable model or just make a quick sanity check.
      - Often, end users are more interested in understanding which variables are important than in the model’s accuracy.
      … in other words, you will quite often not need it.
  • 18. 3 - How much theory do I need?
  • 19. How much math do I need? Knowing some math/theory certainly helps, but frankly it is probably sufficient to have some intuition for how each method works. If you know your maths, it is easier to implement models from papers or your own models, but existing frameworks are enough to do a lot already. There has never been a better time to get into machine learning!
      - A huge number of tutorials on e.g. GitHub and Medium
      - MOOCs: Andrew Ng (Coursera and deeplearning.ai), Jeremy Howard (fast.ai), cognitiveclass.ai
      - Software frameworks such as scikit-learn for Python
      - DSX tutorials & articles: https://apsportal.ibm.com/community
  • 21. A way to practice: online contests (Kaggle)
      - Largest online predictive modeling competition platform
      - Founded 2010. Acquired by Google 2017
      - Companies or organizations define problems and provide data; users compete for the best score. The winner gets a cash prize or, in some cases, a job offer
  • 22. Why take part in contests?
      - The leaderboard is motivating
      - You can learn a lot from the discussion board
      - Useful to learn and try out new techniques
      - Learn not to overfit
  • 24. 4 - How to go from idea to trained & deployed model?
  • 25. Checklist for a machine learning idea
      - Understand the goal. What do you want to be able to predict or understand? Can it be measured in a good way? Do you have the data necessary to model it?
      - Actionability. What is the next step if you get a good predictive model? Can you use it? Are the variables that you use such that they can be easily adjusted? Will end users be able to act on the results?
      - Data quality. Can you extract the data in a good way? Are the data complete? Are there missing/suspicious values?
      - Training data size and shape. Do you have enough examples for training compared to the number of variables (dimensions)? Do you have “wide” or “long” data?
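The data quality and size/shape questions in the checklist can be turned into a quick first look at a dataset. The sketch below is one possible way to do that in plain Python. The rows and column names are invented for illustration, and treating data as “wide” when there are more variables than examples is a rule of thumb assumed here, not a definition from the talk.

```python
# Illustrative sketch only: the rows and column names below are made up.
rows = [
    {"area_m2": 54.0, "rooms": 2, "price": 310000.0},
    {"area_m2": None, "rooms": 3, "price": 420000.0},
    {"area_m2": 71.5, "rooms": None, "price": 390000.0},
    {"area_m2": 48.0, "rooms": 1, "price": None},
]

def missing_fraction(rows):
    """Fraction of missing (None) values per column."""
    columns = rows[0].keys()
    return {c: sum(r[c] is None for r in rows) / len(rows) for c in columns}

def shape_report(rows, target):
    """Compare the number of examples with the number of input variables."""
    n_examples = len(rows)
    n_variables = len([c for c in rows[0] if c != target])
    return {
        "examples": n_examples,
        "variables": n_variables,
        # Assumed rule of thumb: "wide" means more variables than examples.
        "wide": n_variables > n_examples,
    }

print(missing_fraction(rows))
print(shape_report(rows, target="price"))
```

In practice a data-frame library would do this in one or two calls; the point is only that these checks are cheap and worth running before any modelling.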
  • 27. Tools: open-source vs. proprietary (columns: open source, proprietary; rows: data science tools, project collaboration, notebooks, model deployment). Tools named: scikit-learn, TensorFlow, Keras, caret, mlbench, Shiny
  • 28. Deploying machine learning models
      - Easiest way? Watson ML (today’s demo), or equivalents on Azure (Microsoft), CloudML (Google), ECS (Amazon) …
      - TensorFlow (like some others) has built-in serving capabilities (TensorFlow Serving).
      - Do-it-yourself web servers: often done using Flask (Python web framework) or, for language-independent model deployment, OpenScoring (uses PMML).
      - For non-production-grade deployment, you can use Shiny (R web app library) or Python equivalents: (Plotly) Dash, Bokeh, (IBM) PixieDust.
      - Yhat (https://www.yhat.com/products/scienceops): commercial model deployment solution that hooks directly into R or Python.
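To make the do-it-yourself option concrete, here is a rough sketch of a scoring endpoint. The slide names Flask for this; to stay dependency-free, the sketch swaps in Python's standard-library `http.server` instead. The hard-coded linear “model”, the `features` field in the request, and the names `predict`, `score_request`, `ScoringHandler` and `serve` are all hypothetical stand-ins, not part of any tool mentioned in the talk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained" model: fixed weights and bias, for illustration only.
WEIGHTS = [2.0, -1.0]
BIAS = 0.5

def predict(features):
    """Score one example with the stand-in linear model."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))

def score_request(body):
    """Turn a JSON request body into a JSON response body (both bytes)."""
    payload = json.loads(body)
    result = {"prediction": predict(payload["features"])}
    return json.dumps(result).encode("utf-8")

class ScoringHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body, score it, and reply with JSON.
        length = int(self.headers.get("Content-Length", 0))
        response = score_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)

def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("localhost", port), ScoringHandler).serve_forever()
```

Calling `serve()` starts the server locally; a POST with a body such as `{"features": [1.0, 1.0]}` returns the prediction as JSON. Keeping the scoring logic in `score_request`, separate from the handler, makes it easy to test without starting a server, and a production setup would add input validation and error handling.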
  • 29. What was not covered in this talk
      - Visualisation and exploratory data analysis (including dynamic data exploration apps)
      - Details on how different ML models work
      - Case studies
      Maybe next time? :)