Introduction to
DATA SCIENCE
Challenges deep-dive
Why the Hype Around
Data Science?
● The demand for data scientists will soar by 28% by 2023
● Data scientist roles have grown over 650% since 2012, but
currently, 35,000 people in the US have data science skills,
while hundreds of companies are hiring for those roles.
● Software engineering is a common starting point for
professionals who are in the top five fastest-growing jobs today.
● Data Science gives you career flexibility
Who Is a Data Scientist?
What is Machine
Learning?
Machine learning teaches computers to do what comes naturally to
humans and animals: learn from experience. Machine learning
algorithms use computational methods to “learn” information directly
from data without relying on a predetermined equation as a model.
The algorithms adaptively improve their performance as the number
of samples available for learning increases.
A Definition
A computer program is said to learn from experience E with
respect to some task T and some performance measure P if its
performance on T, as measured by P, improves with experience E.
-Tom Mitchell
A Small Question
Suppose we feed a learning algorithm a lot of historical weather
data, and have it learn to predict weather. In this setting, what is
T, P, and E?
More Data,
More Questions,
Better Answers
Real World
Applications
With the rise in big data, machine learning has become particularly
important for solving problems in areas like these:
● Image processing and computer vision, for face recognition,
motion detection, and object detection
● Computational biology, for tumor detection, drug discovery, and
DNA sequencing
● Energy production, for price and load forecasting
● Automotive, aerospace, and manufacturing, for predictive
maintenance
● Natural language processing
How Machine
Learning Works
Machine learning uses two types of techniques:
● Supervised learning, which trains a model on known input and
output data so that it can predict future outputs
● Unsupervised learning, which finds hidden patterns or intrinsic
structures in input data.
Machine Learning
Techniques
Supervised
Learning
The aim of supervised machine learning is to build a model that
makes predictions based on evidence in the presence of
uncertainty. A supervised learning algorithm takes a known set of
input data and known responses to the data (output) and trains a
model to generate reasonable predictions for the response to new
data.
Classification - predict discrete responses
Classification models classify input data into categories, for
example, whether an email is genuine or spam, or whether a tumor
is cancerous or benign.
Regression - predict continuous responses
Regression models predict continuous quantities, for example,
changes in temperature or fluctuations in power
demand. Typical applications include electricity load forecasting and
algorithmic trading.
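A minimal sketch of both kinds of supervised model, in plain Python with no external libraries. The training data, the function names (`nearest_neighbor_classify`, `fit_line`), and the choice of a 1-nearest-neighbor classifier and a least-squares line are illustrative assumptions, not methods named in these slides.

```python
def nearest_neighbor_classify(train_x, train_y, x):
    """Classification: return the label of the closest training example."""
    distances = [abs(x - xi) for xi in train_x]
    return train_y[distances.index(min(distances))]


def fit_line(xs, ys):
    """Regression: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical known inputs and responses: suspicious-word count -> email label.
word_counts = [0, 1, 2, 8, 9, 10]
labels = ["genuine", "genuine", "genuine", "spam", "spam", "spam"]
predicted_label = nearest_neighbor_classify(word_counts, labels, 7)

# Hypothetical known inputs and responses: temperature (C) -> power demand (MW).
temperatures = [10.0, 20.0, 30.0, 40.0]
demand = [25.0, 45.0, 65.0, 85.0]
slope, intercept = fit_line(temperatures, demand)
predicted_demand = slope * 25.0 + intercept
```

The classifier returns one of a fixed set of labels, while the regression model returns a number on a continuous scale. That difference is what separates the two problem types.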
Unsupervised
Learning
Unsupervised learning finds hidden patterns or intrinsic structures in
data. It is used to draw inferences from datasets consisting of input
data without labeled responses.
Clustering is the most common unsupervised learning technique. It
is used for exploratory data analysis to find hidden patterns or
groupings in data. Applications for clustering include gene sequence
analysis, market research, and object recognition.
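A minimal sketch of clustering on one-dimensional data, assuming k-means as the technique (the slides do not name a specific clustering algorithm). The data, the starting centers, and the function name `k_means_1d` are hypothetical. Note that no labels are supplied: the groups come from the data alone.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster 1-D values around the given starting centers (k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical data: monthly purchase counts for six products.
purchases = [1, 2, 3, 20, 21, 22]
final_centers, groups = k_means_1d(purchases, centers=[0, 10])
```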
Knowledge Test
Which of the following would you apply supervised learning to?
1. Given genetic (DNA) data from a person, predict the odds of him/her developing
diabetes over the next 10 years.
2. Given a large dataset of medical records from patients suffering from heart
disease, try to learn whether there might be different clusters of such patients for
which we might tailor separate treatments.
3. Given data on how 1000 medical patients respond to an experimental drug (such
as effectiveness of the treatment, side effects, etc.), discover whether there are
different categories or "types" of patients in terms of how they respond to the
drug, and if so what these categories are.
4. Have a computer examine an audio clip of a piece of music, and classify whether
or not there are vocals (i.e., a human voice singing) in that audio clip, or if it is a
clip of only musical instruments (and no vocals).
Knowledge Test
Which of the following questions can be answered using a
classification algorithm?
1. How does the exchange rate depend on the GDP?
2. Does a document contain the handwritten letter S?
3. How can I group supermarket products using purchase
frequency?
Knowledge Test
1. Suppose you are working on weather prediction, and you
would like to predict whether or not it will be raining at 5pm
tomorrow. You want to use a learning algorithm for this. Would
you treat this as a classification or a regression problem?
2. Suppose you are working on stock market prediction. You
would like to predict whether or not a certain company will
declare bankruptcy within the next 7 days (by training on data
of similar companies that had previously been at risk of
bankruptcy). Would you treat this as a classification or a
regression problem?
How Do You
Decide Which
Algorithm
to Use?
Choosing the right algorithm can seem overwhelming.
There are dozens of supervised and unsupervised machine
learning algorithms, and each takes a different approach to
learning.
There is no best method or one size fits all. Finding the right
algorithm is partly just trial and error.
But algorithm selection also depends on the size and type of data
you’re working with, the insights you want to get from the data, and
how those insights will be used.
Two - Class Classification
Multi - Class Classification
Anomaly Detection
Regression
Clustering
When should we use
Machine Learning?
Consider using machine learning when you have a complex task or
problem involving a large amount of data and lots of variables, but
no existing formula or equation.
Knowledge Test
Have a look at the statements below and identify the one which
is not a machine learning problem
1. Given a viewer's shopping habits, recommend a product to
purchase the next time she visits your website.
2. Given the symptoms of a patient, identify her illness.
3. Predict the USD/EUR exchange rate for February 2023.
4. Compute the mean wage of 10 employees for your company.
Knowledge Test
Which of the following statements uses a machine learning
model?
1. Determine whether an incoming email is spam or not
2. Obtain the name of last year's FIFA Ballon d’Or champion
3. Automatically tagging your new Facebook photos
4. Select the student with the highest grade on a statistics course
Getting
Started
There is NO
Straight Line
With machine learning there’s rarely a straight line from start to
finish. You’ll find yourself constantly iterating and trying different
ideas and approaches.
Machine Learning
Challenges
● Data comes in all shapes and sizes
● Preprocessing your data might require specialized knowledge
and tools
● It takes time to find the best model to fit the data.
Questions to Ask
Before Starting
Every machine learning workflow begins with three questions:
● What kind of data are you working with?
● What insights do you want to get from it?
● How and where will those insights be applied?
Your answers to these questions help you decide whether to use
supervised or unsupervised learning.
Data Science -
Five Questions
There are only five questions that data science answers:
● Is this A or B?
● Is this weird?
● How much – or – How many?
● How is this organized?
● What should I do next?
Workflow at a Glance
Step 1 -
Load the Data
We store the labeled data sets in a text file. A flat file format such as
text or CSV is easy to work with and makes it straightforward to
import data.
Machine learning algorithms aren’t smart enough to tell the
difference between noise and valuable information. Before using the
data for training, we need to make sure it’s clean and complete.
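A minimal sketch of loading a CSV data set and setting aside incomplete rows, using only Python's standard library. The column names, the sample contents, and the function name `load_complete_rows` are hypothetical. In practice the text would come from a file on disk rather than a string.

```python
import csv
import io

# Hypothetical file contents, inlined here so the example is self-contained.
raw_text = """temperature,humidity,rained
21.5,0.60,no
,0.85,yes
18.0,0.90,yes
25.1,,no
"""


def load_complete_rows(text):
    """Parse CSV text and keep only rows where every field is filled in."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    complete = [row for row in rows if all(value.strip() for value in row.values())]
    return rows, complete


all_rows, clean_rows = load_complete_rows(raw_text)
```

Dropping incomplete rows is only one option. Depending on the data, filling in missing values may be a better choice.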
Step 2 -
Preprocess the Data
To preprocess the data we do the following:
● Look for outliers–data points that lie outside the rest of the data
● Check for missing values
● Divide the data into two sets
○ We save part of the data for testing (the test set) and use
the rest (the training set) to build models. This is referred
to as holdout, and is a useful cross-validation technique.
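The preprocessing steps above can be sketched as follows. The two-standard-deviation outlier rule, the 25% test fraction, the fixed random seed, and the function names are all assumptions chosen for illustration. The slides do not prescribe specific values.

```python
import random
import statistics


def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return [v for v in values if stdev > 0 and abs(v - mean) / stdev > threshold]


def holdout_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (training set, test set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical sensor readings with one obviously bad value.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]
outliers = find_outliers(readings)
cleaned = [r for r in readings if r not in outliers]
train_set, test_set = holdout_split(cleaned)
```

Shuffling before the split keeps the test set from being just the last rows of the file, which could differ systematically from the rest.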
Step 3 -
Derive Features
Deriving features (also known as feature engineering or feature
extraction) turns raw data into information that a machine learning
algorithm can use.
Use feature selection to:
• Improve the accuracy of a machine learning algorithm
• Boost model performance for high-dimensional data sets
• Improve model interpretability
• Prevent overfitting
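A minimal sketch of deriving features from raw data. The two sensor windows, the choice of mean, standard deviation, and range as features, and the function name `derive_features` are hypothetical.

```python
import statistics


def derive_features(signal):
    """Summarize a raw sequence of readings as a small feature dictionary."""
    return {
        "mean": statistics.mean(signal),
        "std": statistics.pstdev(signal),
        "range": max(signal) - min(signal),
    }


# Hypothetical raw sensor windows for two activities.
resting = [1.0, 1.0, 1.0, 1.0]
walking = [0.0, 2.0, 0.0, 2.0]

resting_features = derive_features(resting)
walking_features = derive_features(walking)
```

Here both windows have the same mean, so the mean alone could not tell the two activities apart, while the standard deviation and range can. Deciding which derived features to keep is the job of feature selection.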
Step 4 -
Build and Train Model
● The chosen algorithms and the training data are used for
building and training the model.
● The test data is used to evaluate the model.
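A minimal sketch of building a model on the training set and then checking it on the held-out test set. The single-feature threshold model, the data, and the function names are illustrative assumptions. A real project would typically use a library implementation.

```python
def train_threshold(samples):
    """Pick the threshold on a single feature that best separates the labels.

    `samples` is a list of (feature_value, label) pairs with labels 0 or 1.
    The model predicts 1 when the feature value is above the threshold.
    """
    candidates = sorted(value for value, _ in samples)
    best_threshold, best_correct = candidates[0], -1
    for threshold in candidates:
        correct = sum((value > threshold) == bool(label) for value, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, samples):
    """Fraction of samples the threshold model labels correctly."""
    correct = sum((value > threshold) == bool(label) for value, label in samples)
    return correct / len(samples)


# Hypothetical labeled data, already split into training and test sets.
training_set = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test_set = [(2.5, 0), (6.5, 1), (1.5, 0), (7.5, 1)]

threshold = train_threshold(training_set)
test_accuracy = accuracy(threshold, test_set)
```

The test set plays no part in choosing the threshold, so the accuracy measured on it estimates how the model behaves on data it has not seen.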
Step 5 -
Improve the Model
Improving a model can take two different directions: make the
model simpler or add complexity.
Simplify - reduce the number of features
Add Complexity - make it more fine-tuned
Simplify
Popular feature reduction techniques include:
● Correlation matrix – shows the relationship between
variables, so that variables (or features) that are not highly
correlated with the response can be removed.
● Principal component analysis (PCA) - eliminates redundancy
by finding a combination of features that captures key
distinctions between the original features and brings out strong
patterns in the dataset.
● Sequential feature reduction – reduces features iteratively on
the model until there is no improvement in performance.
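A minimal sketch of the correlation matrix idea, using the Pearson correlation coefficient computed by hand. The column names and values are hypothetical, and treating `demand` as the response is an assumption made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: correlation}} for every pair of columns."""
    return {
        a: {b: pearson(columns[a], columns[b]) for b in columns}
        for a in columns
    }


# Hypothetical data set: two candidate features and the response (demand).
columns = {
    "temperature": [10.0, 20.0, 30.0, 40.0],
    "noise": [3.0, 1.0, 4.0, 2.0],
    "demand": [25.0, 45.0, 65.0, 85.0],
}
matrix = correlation_matrix(columns)
```

In this made-up data, `temperature` tracks `demand` closely while `noise` does not, so `noise` would be the candidate for removal. Correlation only captures linear relationships, so a low value is a hint rather than proof that a feature is useless.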
Add Complexity
● Use model combination – merge multiple simpler models into
a larger model that is better able to represent the trends in the
data than any of the simpler models could on their own.
● Add more data sources
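A minimal sketch of model combination, assuming majority voting as the way to merge the simpler models (the slides do not say which combination method is meant). The three spam rules and the email fields are hypothetical.

```python
from collections import Counter


def majority_vote(models, x):
    """Combine several classifiers by returning the most common prediction."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical, deliberately simple spam rules.
def many_links(email):
    return "spam" if email["links"] > 3 else "genuine"


def shouting(email):
    return "spam" if email["caps_ratio"] > 0.5 else "genuine"


def unknown_sender(email):
    return "spam" if not email["known_sender"] else "genuine"


email = {"links": 5, "caps_ratio": 0.2, "known_sender": False}
combined = majority_vote([many_links, shouting, unknown_sender], email)
```

Using an odd number of two-class models avoids tied votes.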
TO DO
● Getting Started
● Familiarize yourself with Maths and
Algorithms
● Select the Infrastructure or
Tool
● Create your profile and
participate in competitions
Christy Abraham Joy
Email - christyabrahamjoy@gmail.com
Mob - +91 94000 95273
Feel Free to Contact!

More Related Content

What's hot

ChatGPT What It Is and How Writers Can Use It.pdf
ChatGPT What It Is and How Writers Can Use It.pdfChatGPT What It Is and How Writers Can Use It.pdf
ChatGPT What It Is and How Writers Can Use It.pdf
Adsy
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
DianaGray10
 
Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdf
PremNaraindas1
 
How AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdfHow AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdf
Mujeeb Riaz
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Laguna State Polytechnic University
 
AI FOR BUSINESS LEADERS
AI FOR BUSINESS LEADERSAI FOR BUSINESS LEADERS
AI FOR BUSINESS LEADERS
Andre Muscat
 
Using AI chatbots for deep learning and teaching with specific examples to en...
Using AI chatbots for deep learning and teaching with specific examples to en...Using AI chatbots for deep learning and teaching with specific examples to en...
Using AI chatbots for deep learning and teaching with specific examples to en...
Nigel Daly
 
Making Sense of Analytics
Making Sense of AnalyticsMaking Sense of Analytics
Making Sense of Analytics
Dana DiTomaso
 
Data science and Artificial Intelligence
Data science and Artificial IntelligenceData science and Artificial Intelligence
Data science and Artificial Intelligence
Suman Srinivasan
 
How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...
How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...
How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...
Volume Nine
 
Chat GPT Intoduction.pdf
Chat GPT Intoduction.pdfChat GPT Intoduction.pdf
Chat GPT Intoduction.pdf
Thiyagu K
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
Javaria Chiragh
 
Fight for Yourself: How to Sell Your Ideas and Crush Presentations
Fight for Yourself: How to Sell Your Ideas and Crush PresentationsFight for Yourself: How to Sell Your Ideas and Crush Presentations
Fight for Yourself: How to Sell Your Ideas and Crush Presentations
Digital Surgeons
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
Arturo Pelayo
 
ppt about chatgpt.pptx
ppt about chatgpt.pptxppt about chatgpt.pptx
ppt about chatgpt.pptx
Srinivas237938
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
Alireza Esmikhani
 
Unlocking the Power of ChatGPT
Unlocking the Power of ChatGPTUnlocking the Power of ChatGPT
Unlocking the Power of ChatGPT
Kristine Schachinger SEO and Online Marketing
 
What is chat gpt
What is chat gptWhat is chat gpt
What is chat gpt
Home
 
How to get things done - Lessons from Yahoo, Google, Netflix and Meta
How to get things done - Lessons from Yahoo, Google, Netflix and Meta How to get things done - Lessons from Yahoo, Google, Netflix and Meta
How to get things done - Lessons from Yahoo, Google, Netflix and Meta
Ido Green
 
Generative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptxGenerative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptx
Colleen Farrelly
 

What's hot (20)

ChatGPT What It Is and How Writers Can Use It.pdf
ChatGPT What It Is and How Writers Can Use It.pdfChatGPT What It Is and How Writers Can Use It.pdf
ChatGPT What It Is and How Writers Can Use It.pdf
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdf
 
How AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdfHow AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdf
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
AI FOR BUSINESS LEADERS
AI FOR BUSINESS LEADERSAI FOR BUSINESS LEADERS
AI FOR BUSINESS LEADERS
 
Using AI chatbots for deep learning and teaching with specific examples to en...
Using AI chatbots for deep learning and teaching with specific examples to en...Using AI chatbots for deep learning and teaching with specific examples to en...
Using AI chatbots for deep learning and teaching with specific examples to en...
 
Making Sense of Analytics
Making Sense of AnalyticsMaking Sense of Analytics
Making Sense of Analytics
 
Data science and Artificial Intelligence
Data science and Artificial IntelligenceData science and Artificial Intelligence
Data science and Artificial Intelligence
 
How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...
How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...
How to Use AI (Like ChatGPT & Bard) in your SEO & Content - A Comprehensive S...
 
Chat GPT Intoduction.pdf
Chat GPT Intoduction.pdfChat GPT Intoduction.pdf
Chat GPT Intoduction.pdf
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Fight for Yourself: How to Sell Your Ideas and Crush Presentations
Fight for Yourself: How to Sell Your Ideas and Crush PresentationsFight for Yourself: How to Sell Your Ideas and Crush Presentations
Fight for Yourself: How to Sell Your Ideas and Crush Presentations
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
 
ppt about chatgpt.pptx
ppt about chatgpt.pptxppt about chatgpt.pptx
ppt about chatgpt.pptx
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
Unlocking the Power of ChatGPT
Unlocking the Power of ChatGPTUnlocking the Power of ChatGPT
Unlocking the Power of ChatGPT
 
What is chat gpt
What is chat gptWhat is chat gpt
What is chat gpt
 
How to get things done - Lessons from Yahoo, Google, Netflix and Meta
How to get things done - Lessons from Yahoo, Google, Netflix and Meta How to get things done - Lessons from Yahoo, Google, Netflix and Meta
How to get things done - Lessons from Yahoo, Google, Netflix and Meta
 
Generative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptxGenerative AI, WiDS 2023.pptx
Generative AI, WiDS 2023.pptx
 

Similar to Introduction to Data Science

Introduction To Machine Learning
Introduction To Machine LearningIntroduction To Machine Learning
Introduction To Machine Learning
Knoldus Inc.
 
what-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfwhat-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdf
Temok IT Services
 
INTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptxINTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptx
srikanthkallem1
 
Machine learning applications nurturing growth of various business domains
Machine learning applications nurturing growth of various business domainsMachine learning applications nurturing growth of various business domains
Machine learning applications nurturing growth of various business domains
Shrutika Oswal
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.ppt
ARVIND SARDAR
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Amit Kumar
 
BIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNINGBIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNING
Umair Shafique
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
Johnson Ubah
 
detailed Presentation on supervised learning
 detailed Presentation on supervised learning detailed Presentation on supervised learning
detailed Presentation on supervised learning
ZAMANCHBWN
 
AI.pdf
AI.pdfAI.pdf
AI.pdf
Tariqqandeel
 
machine_learning_section1_ebook.pdf
machine_learning_section1_ebook.pdfmachine_learning_section1_ebook.pdf
machine_learning_section1_ebook.pdf
agfi
 
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Madhav Mishra
 
Training_Report_on_Machine_Learning.docx
Training_Report_on_Machine_Learning.docxTraining_Report_on_Machine_Learning.docx
Training_Report_on_Machine_Learning.docx
ShubhamBishnoi14
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
StephenAmell4
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
AnastasiaSteele10
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
JamieDornan2
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
AnastasiaSteele10
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
StephenAmell4
 
Supervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its applicationSupervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its application
Tara ram Goyal
 
Big data, big opportunities
Big data, big opportunitiesBig data, big opportunities
Big data, big opportunities
Chouaieb NEMRI
 

Similar to Introduction to Data Science (20)

Introduction To Machine Learning
Introduction To Machine LearningIntroduction To Machine Learning
Introduction To Machine Learning
 
what-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfwhat-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdf
 
INTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptxINTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptx
 
Machine learning applications nurturing growth of various business domains
Machine learning applications nurturing growth of various business domainsMachine learning applications nurturing growth of various business domains
Machine learning applications nurturing growth of various business domains
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.ppt
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
BIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNINGBIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNING
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
 
detailed Presentation on supervised learning
 detailed Presentation on supervised learning detailed Presentation on supervised learning
detailed Presentation on supervised learning
 
AI.pdf
AI.pdfAI.pdf
AI.pdf
 
machine_learning_section1_ebook.pdf
machine_learning_section1_ebook.pdfmachine_learning_section1_ebook.pdf
machine_learning_section1_ebook.pdf
 
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Training_Report_on_Machine_Learning.docx
Training_Report_on_Machine_Learning.docxTraining_Report_on_Machine_Learning.docx
Training_Report_on_Machine_Learning.docx
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
Supervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its applicationSupervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its application
 
Big data, big opportunities
Big data, big opportunitiesBig data, big opportunities
Big data, big opportunities
 

Recently uploaded

Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
Vineet
 
06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus
Timothy Spann
 
Bangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts ServiceBangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts Service
nhero3888
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
ytypuem
 
Senior Software Profiles Backend Sample - Sheet1.pdf
Senior Software Profiles  Backend Sample - Sheet1.pdfSenior Software Profiles  Backend Sample - Sheet1.pdf
Senior Software Profiles Backend Sample - Sheet1.pdf
Vineet
 
[VCOSA] Monthly Report - Cotton & Yarn Statistics March 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics March 2024[VCOSA] Monthly Report - Cotton & Yarn Statistics March 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics March 2024
Vietnam Cotton & Spinning Association
 
saps4hanaandsapanalyticswheretodowhat1565272000538.pdf
saps4hanaandsapanalyticswheretodowhat1565272000538.pdfsaps4hanaandsapanalyticswheretodowhat1565272000538.pdf
saps4hanaandsapanalyticswheretodowhat1565272000538.pdf
newdirectionconsulta
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
Introduction to Data Science

  • 2.
  • 3.
  • 4.
  • 5. Challenges deep-dive
      Why the Hype Around Data Science?
      ● The demand for data scientists will soar by 28% by 2023.
      ● Data scientist roles have grown over 650% since 2012, but currently, 35,000 people in the US have data science skills, while hundreds of companies are hiring for those roles.
      ● Software engineering is a common starting point for professionals who are in the top five fastest-growing jobs today.
      ● Data science gives you career flexibility.
  • 6. Who Are Data Scientists?
  • 7.
  • 8. What Is Machine Learning?
      Machine learning teaches computers to do what comes naturally to humans and animals: learn from experience. Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model. The algorithms adaptively improve their performance as the number of samples available for learning increases.
  • 9. A Definition
      A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E.
      - Tom Mitchell
  • 10. A Small Question
      Suppose we feed a learning algorithm a lot of historical weather data, and have it learn to predict weather. In this setting, what are T, P, and E?
  • 11.
  • 13. Real World Applications
      With the rise in big data, machine learning has become particularly important for solving problems in areas like these:
      ● Image processing and computer vision, for face recognition, motion detection, and object detection
      ● Computational biology, for tumor detection, drug discovery, and DNA sequencing
      ● Energy production, for price and load forecasting
      ● Automotive, aerospace, and manufacturing, for predictive maintenance
      ● Natural language processing
  • 14. How Machine Learning Works
      Machine learning uses two types of techniques:
      ● Supervised learning, which trains a model on known input and output data so that it can predict future outputs
      ● Unsupervised learning, which finds hidden patterns or intrinsic structures in input data
  • 16. Supervised Learning
      The aim of supervised machine learning is to build a model that makes predictions based on evidence in the presence of uncertainty. A supervised learning algorithm takes a known set of input data and known responses to the data (output) and trains a model to generate reasonable predictions for the response to new data.
  • 17. Classification - predict discrete responses
      Classification models classify input data into categories, for example, whether an email is genuine or spam, or whether a tumor is cancerous or benign.
      Regression - predict continuous responses
      Regression models predict continuous quantities, for example, changes in temperature or fluctuations in power demand. Typical applications include electricity load forecasting and algorithmic trading.
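
The difference between the two kinds of response can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy data, the nearest-neighbor rule, and the function names `nearest_neighbor_label` and `fit_line` are invented here and do not come from the slides.

```python
# Hypothetical toy data and helper names, invented for illustration only.

def nearest_neighbor_label(x, examples):
    """Classification: return the discrete label of the closest training example."""
    closest = min(examples, key=lambda pair: abs(pair[0] - x))
    return closest[1]


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Classification: number of suspicious words in an email -> "spam" or "genuine".
emails = [(0, "genuine"), (1, "genuine"), (7, "spam"), (9, "spam")]
label = nearest_neighbor_label(8, emails)  # a category

# Regression: hour of the day -> temperature in degrees.
hours = [6, 9, 12, 15]
temps = [10.0, 13.0, 16.0, 19.0]
slope, intercept = fit_line(hours, temps)
predicted_temp = slope * 18 + intercept  # a number on a continuous scale
```

The classifier can only ever answer with one of the labels it has seen, while the regression line can return any real number.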
  • 18. Unsupervised Learning
      Unsupervised learning finds hidden patterns or intrinsic structures in data. It is used to draw inferences from datasets consisting of input data without labeled responses.
  • 19. Clustering is the most common unsupervised learning technique. It is used for exploratory data analysis to find hidden patterns or groupings in data. Applications for clustering include gene sequence analysis, market research, and object recognition.
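
As a rough sketch of what clustering does, the snippet below runs k-means, one widely used clustering algorithm that the slides do not name, on one-dimensional data. The data, the starting centers, and the function name `k_means_1d` are assumptions made for this example.

```python
def k_means_1d(values, centers, iterations=10):
    """Plain k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical data: purchases per month for six customers, with no labels.
purchases_per_month = [1, 2, 3, 20, 21, 22]
centers, clusters = k_means_1d(purchases_per_month, centers=[0.0, 10.0])
```

No labels are supplied; the two groups (occasional and frequent buyers) emerge from the data alone.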
  • 20. Knowledge Test
      Which of the following would you apply supervised learning to?
      1. Given genetic (DNA) data from a person, predict the odds of him/her developing diabetes over the next 10 years.
      2. Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different clusters of such patients for which we might tailor separate treatments.
      3. Given data on how 1000 medical patients respond to an experimental drug (such as effectiveness of the treatment, side effects, etc.), discover whether there are different categories or "types" of patients in terms of how they respond to the drug, and if so what these categories are.
      4. Have a computer examine an audio clip of a piece of music, and classify whether or not there are vocals (i.e., a human voice singing) in that audio clip, or if it is a clip of only musical instruments (and no vocals).
  • 21. Knowledge Test
      Which of the following questions can be answered using a classification algorithm?
      1. How does the exchange rate depend on the GDP?
      2. Does a document contain the handwritten letter S?
      3. How can I group supermarket products using purchase frequency?
  • 22. Knowledge Test
      1. Suppose you are working on weather prediction, and you would like to predict whether or not it will be raining at 5pm tomorrow. You want to use a learning algorithm for this. Would you treat this as a classification or a regression problem?
      2. Suppose you are working on stock market prediction. You would like to predict whether or not a certain company will declare bankruptcy within the next 7 days (by training on data of similar companies that had previously been at risk of bankruptcy). Would you treat this as a classification or a regression problem?
  • 23. How Do You Decide Which Algorithm to Use?
  • 24. Choosing the right algorithm can seem overwhelming. There are dozens of supervised and unsupervised machine learning algorithms, and each takes a different approach to learning.
  • 25. There is no best method or one size fits all. Finding the right algorithm is partly just trial and error, but algorithm selection also depends on the size and type of data you’re working with, the insights you want to get from the data, and how those insights will be used.
  • 26. Two-Class Classification
  • 27. Multi-Class Classification
  • 31. When Should We Use Machine Learning?
      Consider using machine learning when you have a complex task or problem involving a large amount of data and lots of variables, but no existing formula or equation.
  • 32.
  • 33. Knowledge Test
      Have a look at the statements below and identify the one which is not a machine learning problem.
      1. Given a viewer's shopping habits, recommend a product to purchase the next time she visits your website.
      2. Given the symptoms of a patient, identify her illness.
      3. Predict the USD/EUR exchange rate for February 2023.
      4. Compute the mean wage of 10 employees for your company.
  • 34. Knowledge Test
      Which of the following statements uses a machine learning model?
      1. Determine whether an incoming email is spam or not
      2. Obtain the name of last year's FIFA Ballon d’Or champion
      3. Automatically tag your new Facebook photos
      4. Select the student with the highest grade on a statistics course
  • 36. There is NO Straight Line
      With machine learning there’s rarely a straight line from start to finish. You’ll find yourself constantly iterating and trying different ideas and approaches.
  • 37. Machine Learning Challenges
      ● Data comes in all shapes and sizes.
      ● Preprocessing your data might require specialized knowledge and tools.
      ● It takes time to find the best model to fit the data.
  • 38. Questions to Ask Before Starting
      Every machine learning workflow begins with three questions:
      ● What kind of data are you working with?
      ● What insights do you want to get from it?
      ● How and where will those insights be applied?
      Your answers to these questions help you decide whether to use supervised or unsupervised learning.
  • 39. Data Science - Five Questions
      There are only five questions that data science answers:
      ● Is this A or B?
      ● Is this weird?
      ● How much, or how many?
      ● How is this organized?
      ● What should I do next?
  • 40. Knowledge Test
      Which of the following questions can be answered using a classification algorithm?
      1. How does the exchange rate depend on the GDP?
      2. Does a document contain the handwritten letter S?
      3. How can I group supermarket products using purchase frequency?
  • 41.
  • 42. Workflow at a Glance
  • 43. Step 1 - Load the Data
      We store the labeled data sets in a text file. A flat file format such as text or CSV is easy to work with and makes it straightforward to import data. Machine learning algorithms aren’t smart enough to tell the difference between noise and valuable information. Before using the data for training, we need to make sure it’s clean and complete.
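
A minimal sketch of this loading step, using Python's standard `csv` module. The column names and values are hypothetical, and the file is simulated with an in-memory string so that the example is self-contained.

```python
import csv
import io

# Hypothetical labeled data set; in practice this text would come from a CSV file.
RAW = """temperature,humidity,label
21.5,0.40,dry
19.0,,rain
,0.85,rain
23.1,0.35,dry
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def is_complete(row):
    """A row is complete when none of its fields is empty."""
    return all(value.strip() != "" for value in row.values())


rows = load_rows(RAW)
complete_rows = [row for row in rows if is_complete(row)]
```

Here two of the four rows have an empty field, so only two rows survive the completeness check. Whether to drop or fill incomplete rows depends on the data set.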
  • 44. Step 2 - Preprocess the Data
      To preprocess the data we do the following:
      ● Look for outliers: data points that lie outside the rest of the data
      ● Check for missing values
      ● Divide the data into two sets
          ○ We save part of the data for testing (the test set) and use the rest (the training set) to build models. This is referred to as holdout, and is a useful cross-validation technique.
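
The three preprocessing steps above can be sketched as follows. The data is invented, missing values are assumed to be stored as `None`, and the outlier rule (a modified z-score based on the median, with the common rule-of-thumb threshold of 3.5) is one choice among many that the slides do not prescribe.

```python
import random
import statistics


def drop_missing(rows):
    """Remove rows that contain a missing value (represented here as None)."""
    return [row for row in rows if None not in row]


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - median) / mad > threshold for v in values]


def holdout_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical (measurement, label) pairs with one missing value and one outlier.
raw = [(1.0, "a"), (1.2, "a"), (None, "b"), (0.9, "a"), (1.1, "b"),
       (1.3, "b"), (50.0, "a"), (1.0, "b"), (0.8, "a")]

complete = drop_missing(raw)
flags = flag_outliers([value for value, _ in complete])
clean = [row for row, flagged in zip(complete, flags) if not flagged]
training_set, test_set = holdout_split(clean)
```

The fixed seed only makes the split reproducible; the held-out rows are never used while building the model.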
  • 45. Step 3 - Derive Features
      Deriving features (also known as feature engineering or feature extraction) turns raw data into information that a machine learning algorithm can use. Use feature selection to:
      ● Improve the accuracy of a machine learning algorithm
      ● Boost model performance for high-dimensional data sets
      ● Improve model interpretability
      ● Prevent overfitting
  • 46. Step 4 - Build and Train Model
      ● The predefined algorithms and the training data are used for building and training the model.
      ● The test data, which was held out in Step 2, is used to evaluate the trained model.
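
A small sketch of fitting a model on the training set and scoring it on the held-out test set. The nearest-centroid classifier, the single-feature data, and all names below are assumptions chosen for brevity, not the method used in the deck.

```python
def train_nearest_centroid(samples):
    """Learn one centroid (mean feature value) per class from (x, label) pairs."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, x) == label for x, label in samples)
    return correct / len(samples)


# Hypothetical tumor sizes with known labels.
training_set = [(1.0, "benign"), (1.5, "benign"), (2.0, "benign"),
                (6.0, "malignant"), (6.5, "malignant"), (7.0, "malignant")]
test_set = [(1.2, "benign"), (6.8, "malignant"), (3.0, "benign"), (4.5, "benign")]

model = train_nearest_centroid(training_set)  # built from training data only
test_accuracy = accuracy(model, test_set)     # measured on unseen data
```

The model misclassifies one of the four test points, which is exactly the kind of error that scoring on the training data alone would hide.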
  • 47. Step 5 - Improve the Model
      Improving a model can take two different directions: make the model simpler or add complexity.
      ● Simplify - reduce the number of features
      ● Add Complexity - make it more fine-tuned
  • 48. Simplify
      Popular feature reduction techniques include:
      ● Correlation matrix - shows the relationship between variables, so that variables (or features) that are not highly correlated can be removed.
      ● Principal component analysis (PCA) - eliminates redundancy by finding a combination of features that captures key distinctions between the original features and brings out strong patterns in the dataset.
      ● Sequential feature reduction - reduces features iteratively on the model until there is no improvement in performance.
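
One possible reading of the correlation-based reduction is to drop features that are only weakly correlated with the response. The sketch below takes that reading; the feature names, the data, and the 0.5 cutoff are hypothetical choices made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, response, min_abs_corr=0.5):
    """Keep features whose absolute correlation with the response meets the cutoff."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, response)) >= min_abs_corr
    ]


# Hypothetical data: power demand tracks temperature but not the day of the week.
features = {
    "temperature": [10, 15, 20, 25, 30],
    "day_of_week": [3, 5, 1, 5, 3],
}
power_demand = [100, 150, 210, 240, 300]

kept = select_features(features, power_demand)
```

Correlation only captures linear relationships, so a feature dropped by this filter may still carry useful nonlinear information.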
  • 49. Add Complexity
      ● Use model combination - merge multiple simpler models into a larger model that is better able to represent the trends in the data than any of the simpler models could on their own.
      ● Add more data sources
  • 50. TO DO
      ● Get started
      ● Familiarize yourself with the maths and algorithms
      ● Select the infrastructure or tool
      ● Create your profile and participate in competitions
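
Model combination can be illustrated with simple majority voting, which is one of several ways to merge models. The three rules and the email fields below are hypothetical stand-ins for trained models.

```python
from collections import Counter


def majority_vote(models, x):
    """Combine several models by returning the most common prediction."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Three hypothetical, deliberately simple spam rules.
def many_links(email):
    return "spam" if email["links"] > 3 else "genuine"


def shouting(email):
    return "spam" if email["caps_ratio"] > 0.5 else "genuine"


def unknown_sender(email):
    return "spam" if not email["known_sender"] else "genuine"


email = {"links": 5, "caps_ratio": 0.2, "known_sender": False}
verdict = majority_vote([many_links, shouting, unknown_sender], email)
```

Two of the three rules vote "spam", so the combined model outvotes the one rule that this email slips past. Using an odd number of models avoids ties in a two-class problem.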
  • 51. Christy Abraham Joy Email - christyabrahamjoy@gmail.com Mob - +91 94000 95273 Feel Free to Contact!