
1© Cloudera, Inc. All rights reserved.
Beyond ETL: How to Build
Continuous Ingestion for IOT
Sean Anderson | Product Marketing, Cloudera
Kirit Basu | Director of Product Management,
The Internet of Things
I. Driving Data Growth
II. Real-time Capabilities
III. An IOT Data Platform
IV. Cloudera Enterprise for IOT
The IOT Use Case
I. Packaged Goods
II. Sensor Data
III. Real-time Processing
Streamsets Platform
I. Data Collector
II. Data KPIs
III. Containerized Architecture
IV. Real-time Analytics with
Are you currently collecting sensor data?
• Yes
• No
• Plan to in the future
Internet of Things (IoT) – A Revolution In The Making
In Value
Annual Growth
30 Billion
Connected Vehicles
Source - IDC & Gartner Estimates
Internet of
IoT Markets - 2020
IoT Will Drive An Explosion of Data…
Data expected to explode to
44 ZB by 2020
Source: IDC
44 Trillion GB!
80% of data will be
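The two figures on this slide are the same quantity in different units. A minimal sketch of the conversion, assuming decimal (SI) units where 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes (the slide does not say which convention it uses; the function name is illustrative only):

```python
# Assumption: SI (power-of-ten) units, not binary (power-of-two) units.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21


def zettabytes_to_gigabytes(zb: int) -> int:
    """Convert a whole number of zettabytes to gigabytes."""
    return zb * BYTES_PER_ZB // BYTES_PER_GB


gb = zettabytes_to_gigabytes(44)
print(f"{gb:,} GB")  # 44,000,000,000,000 GB, i.e. 44 trillion GB
```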
Value is Maximized when Data is combined from
other sources
Value of Data is multiplied when you combine
and correlate it with other data from relevant
Improvement in value that can be
unlocked by combining data from
multiple IoT applications and sources
SOURCE: McKinsey Global Institute analysis
“Interoperability would significantly improve performance by
combining sensor data from different machines and systems to provide
decision makers with an integrated view of performance across an
entire factory or oil rig.”
The IoT Ecosystem
IoT Gateway
Data Center
Data Analytics
Sensors/ Things
Data Characteristics
• Un-structured
• Intermittent
• Volume & Variety
• Data Routing
• Edge-Processing
• Edge-Storage
Sensors/ Things
• To grow by 50X
• Drop in prices by 70% in last 5 years
Data Storage, Processing & Analytics
IOT Data Characteristics
• More processing in the
• Analytics on the cloud
IOT Data Analytics
• Key to Value Creation
• Combine data from multiple
sources & types
• Drive business insights
IOT Data Characteristics
• Distributed Data
• Cloud & On-Premise
Key Attributes For Next Gen IoT Data Platform
Scale efficiently based on
your data growth
Effectively handle multiple
data-types and structures
Manage the complexity of
real-time IoT data ingest
Fundamentally Secure
Real-Time Analytics – Combine and
analyze data from multiple sources
Flexible deployment options
- Cloud & Distributed Data Processing
Cloudera Enterprise – Making Hadoop Fast, Easy, and Secure
Cloudera Manager
Cloudera Director
Cloudera Navigator
Encrypt and KeyTrustee
Kafka, Flume
Sentry, RecordService
Spark, Hive, Pig
Cloudera Enterprise – The Data & Analytics Platform for IoT
Sensors/ IoT
Data Sources
Internal Systems External Sources
BI Solutions, Real-Time Apps, Search, EDW, Discover
Data Center
Sensor/ IoT Data
IoT Gateway
• Data Storage
• Data Processing
• Machine Learning
• Real-time Analytics
Cloudera Enterprise – Real Time Analytics for IoT
BI Solutions, Real-Time Apps, Search, EDW, Discover, Machine
Spark Streaming
Leadership in Spark
Integrated with EDH
Flexible Storage
Store any and all Data.
Kudu - Fast Analytics on
Fast Data
Real-Time Data
Data Security
Four pillars of security: Perimeter,
Access, Visibility, and Data
+ Record Service
Streaming Ingest
Kafka & Flume - Real-Time
Data Ingest for streaming,
high volume data
Sensor/ IoT Data Internal Systems External Sources
Centralized Mgmt.
Cloudera Manager for
centralized cluster
Manage Multiple Clusters – On
Premise or Cloud environment
- On Premise or Cloud
The Cloudera Difference
Powerful Cluster Ops
Trusted by the pros
Cloud & Hybrid deployment
Integrated with AWS & Azure
Expert Support
Dedicated prescriptive help, just a click away
Real-Time IoT Analytics
The most experience with Spark
The Fastest Analytic SQL
Lowest latency, best concurrency
Fast, Updateable Analytic Storage
High throughput, low latency, and updates
Easy to Manage, Fast for Business, Security without Compromise
Enterprise Encryption
Protects everything transparently
Access Policy Enforcement
Full-stack row/column-based RBAC & dynamic masking
Automated Data Management
Full-stack audit, lineage, discovery, and lifecycle
Secure Operations
Separation of duties, log data redaction
Continuous Data Ingestion with Cloudera & StreamSets
StreamSets enables easy onboarding and effortless data ingest into all
components of CDH
Reliable, Scalable, Always-
on Data Ingest
IoT for Consumer Packaged Goods
Gathering Sensor Data from IoT Devices
Gathering Sensor Data from Freight Containers
The StreamSets Data Collector
Design Debug Execute
StreamSets Deployment Models
Where are you in your development effort for bringing IoT data into Hadoop?
• In Production
• Test and Development
• Planning (Already decided on the architecture)
• Not there yet (Need to decide on an architecture)
• Current Architecture doesn’t work, need a better way to do things
Challenges with IoT Data
• Multitude of Sensors
• Real-Time Streaming
• Multiple Firmware versions
• Bad data from damaged sensors
• Regulatory Constraints
• Data Quality
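Two of the challenges above, multiple firmware versions and bad data from damaged sensors, can be illustrated with a small validation step. This is a sketch in plain Python, not the StreamSets Data Collector API. The field names (`firmware`, `temp_f`, `temp_c`, `sensor_id`), the two firmware formats, and the valid temperature range are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical plausible range for a temperature sensor, in degrees Celsius.
VALID_RANGE_C = (-40.0, 85.0)


@dataclass
class Reading:
    sensor_id: str
    temperature_c: float


def normalize(raw: dict) -> Optional[Reading]:
    """Map a raw record to a common shape, or return None if it is unusable.

    Assumed for illustration: firmware "1" reports Fahrenheit under "temp_f",
    firmware "2" reports Celsius under "temp_c".
    """
    try:
        version = raw["firmware"]
        if version == "1":
            celsius = (float(raw["temp_f"]) - 32.0) * 5.0 / 9.0
        elif version == "2":
            celsius = float(raw["temp_c"])
        else:
            return None  # unknown firmware version
        sensor_id = str(raw["sensor_id"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or malformed field
    low, high = VALID_RANGE_C
    if not (low <= celsius <= high):
        return None  # out of range, e.g. a damaged sensor
    return Reading(sensor_id, round(celsius, 2))


records = [
    {"sensor_id": "a1", "firmware": "1", "temp_f": 68.0},
    {"sensor_id": "b2", "firmware": "2", "temp_c": 21.5},
    {"sensor_id": "c3", "firmware": "2", "temp_c": 9999},  # damaged sensor
    {"sensor_id": "d4", "firmware": "3", "temp_c": 20.0},  # unknown firmware
]
good = [r for r in map(normalize, records) if r is not None]
print(good)
```

In a real pipeline the rejected records would more likely be routed to an error stream for inspection than silently dropped; returning `None` keeps the sketch short.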
Demo – Evolving DataSets
Getting Started is Easy
1. Watch the Beyond ETL
2. Download the Data Collector
3. Contact Us to start a POC
Thank You

More Related Content

What's hot

Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Cloudera, Inc.
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
Cloudera, Inc.
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming Architectures
Cloudera, Inc.
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
Cloudera, Inc.
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
Cloudera, Inc.
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Cloudera, Inc.
Secure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game ChangersSecure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game Changers
Cloudera, Inc.
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
Cloudera, Inc.
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

Cloudera, Inc.
The Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in ChurnThe Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in Churn
Cloudera, Inc.
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester Webinar
Cloudera, Inc.
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18
Cloudera, Inc.
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
Cloudera, Inc.
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Cloudera, Inc.
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Cloudera, Inc.
Risk Management for Data: Secured and Governed
Risk Management for Data: Secured and GovernedRisk Management for Data: Secured and Governed
Risk Management for Data: Secured and Governed
Cloudera, Inc.
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
Cloudera, Inc.
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Cloudera, Inc.
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
Cloudera, Inc.
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
ArabNet ME

What's hot (20)

Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming Architectures
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Secure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game ChangersSecure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game Changers
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

The Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in ChurnThe Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in Churn
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester Webinar
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Risk Management for Data: Secured and Governed
Risk Management for Data: Secured and GovernedRisk Management for Data: Secured and Governed
Risk Management for Data: Secured and Governed
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...

Similar to How to Build Continuous Ingestion for the Internet of Things

Top 5 IoT Use Cases
Top 5 IoT Use CasesTop 5 IoT Use Cases
Top 5 IoT Use Cases
Cloudera, Inc.
IoT-Enabled Predictive Maintenance
IoT-Enabled Predictive MaintenanceIoT-Enabled Predictive Maintenance
IoT-Enabled Predictive Maintenance
Cloudera, Inc.
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
Cloudera, Inc.
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
How to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
How to Architect a Serverless Cloud Data Lake for Enhanced Data AnalyticsHow to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
How to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Cloudera, Inc.
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
Kent Graziano
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
Cloudera, Inc.
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
Amazon Web Services
CL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and Planning
Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6
Azure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar PresentationAzure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar Presentation
Matthew W. Bowers
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber SolutionGet Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber Solution
Cloudera, Inc.
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
Cloudera, Inc.
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
Intel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessIntel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data Success
Cloudera, Inc.
Build Big Data Enterprise Solutions Faster on Azure HDInsight
Build Big Data Enterprise Solutions Faster on Azure HDInsightBuild Big Data Enterprise Solutions Faster on Azure HDInsight
Build Big Data Enterprise Solutions Faster on Azure HDInsight
DataWorks Summit/Hadoop Summit
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptxTrack 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Amazon Web Services
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
Timothy Spann
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023

Similar to How to Build Continuous Ingestion for the Internet of Things (20)

Top 5 IoT Use Cases
Top 5 IoT Use CasesTop 5 IoT Use Cases
Top 5 IoT Use Cases
IoT-Enabled Predictive Maintenance
IoT-Enabled Predictive MaintenanceIoT-Enabled Predictive Maintenance
IoT-Enabled Predictive Maintenance
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
How to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
How to Architect a Serverless Cloud Data Lake for Enhanced Data AnalyticsHow to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
How to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
CL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and Planning
Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6
Azure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar PresentationAzure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar Presentation
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber SolutionGet Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber Solution
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Intel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessIntel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data Success
Build Big Data Enterprise Solutions Faster on Azure HDInsight
Build Big Data Enterprise Solutions Faster on Azure HDInsightBuild Big Data Enterprise Solutions Faster on Azure HDInsight
Build Big Data Enterprise Solutions Faster on Azure HDInsight
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptxTrack 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
Cloudera, Inc.

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
How to Build Continuous Ingestion for the Internet of Things

  • 1. © Cloudera, Inc. All rights reserved. Beyond ETL: How to Build Continuous Ingestion for IoT. Sean Anderson | Product Marketing, Cloudera; Kirit Basu | Director of Product Management, StreamSets
  • 2. Agenda. The Internet of Things: I. Driving Data Growth; II. Real-time Capabilities; III. An IoT Data Platform; IV. Cloudera Enterprise for IoT. The IoT Use Case: I. Packaged Goods; II. Sensor Data; III. Real-time Processing. StreamSets Platform: I. Data Collector; II. Data KPIs; III. Containerized Architecture; IV. Real-time Analytics with Cloudera. Demo
  • 3. Poll: Are you currently collecting sensor data? • Yes • No • Plan to in the future
  • 4. Internet of Things (IoT) – A Revolution In The Making. IoT Markets - 2020: $1.7 Trillion In Value; 20% Annual Growth; 30 Billion Things; 250 Million Connected Vehicles. Source - IDC & Gartner Estimates
  • 5. IoT Will Drive An Explosion of Data… Data expected to explode to 44 ZB by 2020 (44 Trillion GB!). 80% of data will be unstructured. Source: IDC
  • 6. Value is Maximized when Data is Combined from Other Sources. Value of data is multiplied when you combine and correlate it with other data from relevant sources. 40%: improvement in value that can be unlocked by combining data from multiple IoT applications and sources. “Interoperability would significantly improve performance by combining sensor data from different machines and systems to provide decision makers with an integrated view of performance across an entire factory or oil rig.” SOURCE: McKinsey Global Institute analysis
  • 7. The IoT Ecosystem: Consumer; Industrial; IoT Gateway; Cloud; Data Center; Data Analytics; Sensors/Things
  • 8. The IoT Ecosystem: Consumer; Industrial; IoT Gateway; Data Center; Data Analytics; Sensors/Things. Data Characteristics: • Unstructured • Intermittent • Volume & Variety. Gateway: • Data Routing • Edge Processing • Edge Storage. Sensors/Things: • To grow by 50X • Drop in prices by 70% in last 5 years. Data Storage, Processing & Analytics. IoT Data Characteristics: • More processing in the cloud • Analytics on the cloud. IoT Data Analytics: • Key to Value Creation • Combine data from multiple sources & types • Drive business insights. IoT Data Characteristics: • Distributed Data Processing • Cloud & On-Premise. Cloud
  • 9. Key Attributes For Next Gen IoT Data Platform: • Scale efficiently based on your data growth • Effectively handle multiple data types and structures • Manage the complexity of real-time IoT data ingest • Fundamentally Secure • Real-Time Analytics – Combine and analyze data from multiple sources • Flexible deployment options – Cloud & Distributed Data Processing
  • 10. Cloudera Enterprise – Making Hadoop Fast, Easy, and Secure. OPERATIONS: Cloudera Manager, Cloudera Director. DATA MANAGEMENT: Cloudera Navigator, Encrypt and KeyTrustee, Optimizer. INTEGRATE: BATCH (Sqoop); REAL-TIME (Kafka, Flume). PROCESS, ANALYZE, SERVE: BATCH (Spark, Hive, Pig, MapReduce); STREAM (Spark); SQL (Impala); SEARCH (Solr); SDK (Partners). UNIFIED SERVICES: RESOURCE MANAGEMENT (YARN); SECURITY (Sentry, RecordService). STORE: FILESYSTEM (HDFS); RELATIONAL (Kudu); NoSQL (HBase)
  • 11. Cloudera Enterprise – The Data & Analytics Platform for IoT. Sensors/IoT Data Sources; Internal Systems; External Sources; BI Solutions; Real-Time Apps; Search; EDW; Discover; Machine Learning; Data Center; Cloud; Sensor/IoT Data; IoT Gateway. • Data Storage • Data Processing • Machine Learning • Real-time Analytics. [Cloudera Enterprise platform diagram, as on slide 10]
  • 12. Cloudera Enterprise – Real Time Analytics for IoT. BI Solutions; Real-Time Apps; Search; EDW; Discover; Machine Learning. Deployment Flexibility. Spark Streaming. Leadership in Spark. Integrated with EDH. Flexible Storage: Store any and all data. Kudu – Fast Analytics on Fast Data. Real-Time Data Processing. Data Security: Four pillars of security: Perimeter, Access, Visibility, and Data + RecordService. Streaming Ingest: Kafka & Flume – Real-time data ingest for streaming, high-volume data. Sensor/IoT Data; Internal Systems; External Sources. Centralized Mgmt.: Cloudera Manager for centralized cluster management. Manage Multiple Clusters – On Premise or Cloud environment. [Cloudera Enterprise platform diagram, as on slide 10]
  • 13. The Cloudera Difference. Easy to Manage: Powerful Cluster Ops (Trusted by the pros); Cloud & Hybrid deployment (Integrated with AWS & Azure); Expert Support (Dedicated prescriptive help, just a click away). Fast for Business: Real-Time IoT Analytics (The most experience with Spark); The Fastest Analytic SQL (Lowest latency, best concurrency); Fast, Updateable Analytic Storage (High throughput, low latency, and updates). Security without Compromise: Enterprise Encryption (Protects everything transparently); Access Policy Enforcement (Full-stack row/column-based RBAC & dynamic masking); Automated Data Management (Full-stack audit, lineage, discovery, and lifecycle); Secure Operations (Separation of duties, log data redaction)
  • 14. Continuous Data Ingestion with Cloudera & StreamSets. StreamSets enables easy onboarding and effortless data ingest into all components of CDH. Reliable, Scalable, Always-on Data Ingest
  • 15. IoT for Consumer Packaged Goods
  • 16. Gathering Sensor Data from IoT Devices
  • 17. Gathering Sensor Data from Freight Containers
  • 18. The StreamSets Data Collector: Design, Debug, Execute
  • 20. Poll Where are you in your development effort for bringing IoT data into Hadoop? • In Production • Test and Development • Planning (Already decided on the architecture) • Not there yet (Need to decide on an architecture) • Current Architecture doesn’t work, need a better way to do things
  • 21. Challenges with IoT Data • Multitude of Sensors • Real-Time Streaming • Multiple Firmware versions • Bad data from damaged sensors • Regulatory Constraints • Data Quality
  • 22. Demo – Evolving DataSets
  • 23. Getting Started is Easy: 1. Watch the Beyond ETL Series 2. Download the StreamSets Data Collector 3. Contact Us to start a POC
  • 24. Thank You
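
Two of the challenges listed on slide 21, multiple firmware versions and bad data from damaged sensors, can be illustrated with a minimal sketch. This is not StreamSets Data Collector code or any Cloudera API. The record layout, the firmware version strings, the field names (`fw`, `temp_f`, `temp_c`, `sensor_id`) and the plausible temperature range are all assumptions made up for the example.

```python
from typing import Any

# Hypothetical plausible range for a temperature sensor, in degrees Celsius.
TEMP_RANGE_C = (-40.0, 85.0)


def normalize(record: dict[str, Any]) -> dict[str, Any]:
    """Map records from two assumed firmware versions onto one schema."""
    version = record.get("fw")
    if version == "1.0":
        # Assumed v1.0 layout: temperature in Fahrenheit under "temp_f".
        temp_c = (record["temp_f"] - 32.0) * 5.0 / 9.0
    elif version == "2.0":
        # Assumed v2.0 layout: temperature already in Celsius under "temp_c".
        temp_c = record["temp_c"]
    else:
        raise ValueError(f"unknown firmware version: {version!r}")
    return {"sensor_id": record["sensor_id"], "temp_c": round(temp_c, 2)}


def route(records: list[dict[str, Any]]):
    """Split records into clean rows and rejected rows with a reason."""
    good, bad = [], []
    low, high = TEMP_RANGE_C
    for record in records:
        try:
            clean = normalize(record)
        except (KeyError, TypeError, ValueError) as exc:
            # Missing fields, non-numeric readings, or unknown firmware.
            bad.append((record, str(exc)))
            continue
        if low <= clean["temp_c"] <= high:
            good.append(clean)
        else:
            # A reading outside the plausible range suggests a damaged sensor.
            bad.append((record, "temperature out of range"))
    return good, bad


if __name__ == "__main__":
    sample = [
        {"sensor_id": "a", "fw": "1.0", "temp_f": 68.0},
        {"sensor_id": "b", "fw": "2.0", "temp_c": 21.5},
        {"sensor_id": "c", "fw": "2.0", "temp_c": 999.0},
    ]
    good, bad = route(sample)
    print(len(good), "clean,", len(bad), "rejected")
```

Keeping rejected records alongside a reason, rather than dropping them, is one way to make data-quality problems visible downstream; a real pipeline would typically send them to a separate error stream.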