HDInsight Overview
February 2015
Agenda
• What is Big Data?
• What is Hadoop? What is HDInsight?
• Hadoop Ecosystem
• HDInsight Overview
• Working with HDInsight
• Loading Data
• Querying Data
• Setting up an Environment
• Q&A
What is Big Data?
• Data being collected in ever-escalating volumes, at increasingly high
  velocities, and in a widening variety of unstructured formats.
• The term describes any large body of digital information, from the text in a
  Twitter feed, to sensor information from industrial equipment, to information
  about customer browsing and purchases in an online catalog.
• Can be historical (meaning stored data) or real-time (meaning streamed
  directly from the source).
What are Hadoop and HDInsight?
• Apache Hadoop is an open-source software framework for storing and
  processing big data in a distributed fashion on large clusters of commodity
  hardware. It accomplishes two tasks: massive data storage and faster
  processing.
• HDInsight is Microsoft’s cloud-based implementation of Hadoop. HDInsight
  was architected to handle large amounts of data, scaling from terabytes to
  petabytes on demand, and lets users scale clusters up or down as needed.
  Microsoft has partnered with Hortonworks to bring Hadoop to Windows.
Hadoop Names & Technologies
• Hadoop is composed of 3 core components:
  - HDFS – Hadoop Distributed File System (Java-based; can store all kinds of data
    without prior organization)
  - MapReduce – a programming model for processing large sets of data in parallel
  - YARN – a resource management framework for scheduling and handling resource
    requests from distributed applications
• There are other Hadoop components that can be leveraged within HDInsight:
  - Pig – simpler scripting for MapReduce transformations, using a language called Pig Latin
  - Hive – SQL-like querying (HiveQL) that presents data in the form of tables
  - Sqoop – an ETL-like tool that moves data between Hadoop and relational databases
  - Oozie – a Hadoop job scheduler
• Additional technologies included:
  - Ambari, Avro, HBase, Mahout, Storm, ZooKeeper
HDInsight / Hadoop Ecosystem
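
Within this ecosystem, MapReduce is the processing model that Pig and Hive build on. As a rough illustration only, the sketch below runs the map, shuffle, and reduce steps of a word count inside a single Python process. The function names and the sample input are assumptions made for this example; a real Hadoop job distributes the same steps across cluster nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in-process (a cluster would run them in parallel)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical input lines, standing in for files stored in HDFS or blob storage.
    print(word_count(["big data on hadoop", "hadoop on azure"]))
```

Because each map call looks at one line and each reduce call looks at one key, the work can be split across many machines without the steps needing to share state.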
Advantages of Hadoop in the Cloud (HDInsight)
• State-of-the-art Hadoop components
• High availability and reliability of clusters
• Efficient and economical data storage with Azure Blob storage, a
  Hadoop-compatible option
• Integration with other Azure services, including Websites and SQL Database
• Low entry cost
Working with HDInsight
• To get started with HDInsight, you need an MSDN account and access to the Azure portal
• The main components are:
  - HDInsight cluster (can scale the number of nodes up or down as needed)
  - Azure Blob storage (data repository in Azure)
• Queries and jobs can be run through the “Query Console” interface in the Azure
  portal, or through Visual Studio
• To use HDInsight in Visual Studio, you need Azure SDK 2.5 for .NET (available for
  VS 2012, VS 2013, and VS 2015 Preview)
Loading Data to HDInsight
• There are many ways to upload data to Azure Blob storage. Some of the
  more common ones include:
  - Visual Studio
  - PowerShell scripts
  - Azure Storage Explorer
  - CloudXplorer
  - Azure Explorer
Querying Data in HDInsight
• The easiest way to query data is through Hive, which imposes a structure
  on the data and uses a SQL-like language called HiveQL.
  - Hive applies a “schema on read” when accessing the data, and no physical
    table is actually created
  - The queries are translated into MapReduce jobs
  - Hive works best with more structured data
• For unstructured data, use Pig
  - Uses a scripting language called Pig Latin to execute MapReduce jobs
  - An alternative to writing Java code
  - Pig Latin statements follow the general flow of: Load – Transform – Dump or
    Store
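
Schema on read and the Load – Transform – Store flow can both be illustrated without a cluster. The following is a plain Python sketch, not HiveQL or Pig Latin. The column names, the tab delimiter, and the sample rows are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Iterator

# Hypothetical schema: a column name and the type applied to each raw field.
SCHEMA = [("city", str), ("product", str), ("amount", float)]


def load(raw_lines: Iterable[str], delimiter: str = "\t") -> Iterator[dict]:
    """Load: apply the schema while reading; the raw text is never rewritten."""
    for line in raw_lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) != len(SCHEMA):
            continue  # skip rows that do not fit the assumed schema
        try:
            yield {name: cast(value) for (name, cast), value in zip(SCHEMA, fields)}
        except ValueError:
            continue  # skip rows whose fields cannot be converted


def transform(rows: Iterable[dict]) -> Dict[str, float]:
    """Transform: total the amount per city."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["city"]] += row["amount"]
    return dict(totals)


def store(totals: Dict[str, float]) -> str:
    """Store: serialize the result as CSV text, sorted by city."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for city in sorted(totals):
        writer.writerow([city, totals[city]])
    return out.getvalue()


if __name__ == "__main__":
    raw = ["Seattle\tbike\t100.0", "Austin\thelmet\t25.5",
           "Seattle\tlock\t15.0", "malformed row"]
    print(store(transform(load(raw))))
```

The structure lives only in `SCHEMA` and is applied at read time, so the same raw lines could be read again later under a different schema without reloading them.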
Models to Consider when Approaching a Big Data Solution
Case 1 – Iterative Exploration
Choose this model when:
• Handling data that you cannot process using existing systems, perhaps by performing complex
  calculations and transformations that are beyond the capabilities of existing systems to complete
  in a reasonable time.
• Collecting feedback from customers through email, web pages, or external sources such as social
  media sites, then analyzing it to get a picture of customer sentiment for your products.
• Combining information with other data, such as demographic data that indicates population
  density and characteristics in each city where your products are sold.
• Dumping data from your existing information systems into HDInsight so that you can work with it
  without interrupting other business processes or risking corruption of the original data.
• Trying out new ideas and validating processes before implementing them within the live system.
Case 2 – Data warehouse on demand
Choose this model when:
• Storing data in a way that allows you to minimize storage cost by taking advantage of
  cloud-based storage systems, and to minimize runtime cost by initiating a cluster to perform
  processing only when required.
• Exposing both the source data in raw form, and the results of queries executed over this data in
  the familiar row and column format, to a wide range of data analysis tools.
• Storing schemas (or, to be precise, metadata) for tables that are populated by the queries you
  execute, and partitioning the data in tables based on a clustered index so that each has a
  separate metadata definition and can be handled separately.
• Creating views based on tables, and creating functions for use in both tables and queries.
• Consuming the results directly in business applications through interactive analytical tools such as
  Excel, or in corporate reporting platforms such as SQL Server Reporting Services.
Case 3 – ETL automation
Choose this model when:
• Extracting and transforming data before you load it into your existing databases or
  analytical tools.
• Performing categorization and restructuring of data, and extracting summary
  results to remove duplication and redundancy.
• Preparing data so that it is in the appropriate format and has appropriate content
  to power other applications or services.
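
The ETL steps listed above (removing duplication, categorizing, and extracting summary results) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the record shape, the keyword rules, and the category names are all hypothetical, and a production pipeline on HDInsight would more likely express these steps in Hive or Pig.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (customer_id, feedback_text).
Record = Tuple[str, str]


def deduplicate(records: Iterable[Record]) -> List[Record]:
    """Drop exact duplicates while keeping first-seen order."""
    seen = set()
    unique: List[Record] = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


def categorize(text: str) -> str:
    """Assign a coarse category using hypothetical keyword rules."""
    lowered = text.lower()
    if "refund" in lowered or "broken" in lowered:
        return "complaint"
    if "thanks" in lowered or "great" in lowered:
        return "praise"
    return "other"


def summarize(records: Iterable[Record]) -> Dict[str, int]:
    """Restructure cleaned records into per-category counts ready for loading."""
    return dict(Counter(categorize(text) for _, text in deduplicate(records)))


if __name__ == "__main__":
    feedback = [
        ("c1", "Great product, thanks"),
        ("c1", "Great product, thanks"),  # duplicate submission
        ("c2", "Arrived broken"),
        ("c3", "Where is my order?"),
    ]
    print(summarize(feedback))
```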
Case 4 – BI Integration
Choose this model when:
• You have an existing enterprise data warehouse and BI system that you want to
  augment with data from outside your organization.
• You want to explore new ways to combine data in order to provide better insight
  into history and to predict future trends.
• You want to give users more opportunities for self-service reporting and analysis
  that combines managed business data and big data from other sources.
Overview of the Big Data Process
Data analysis is, in many ways, an iterative process, and you should take
this approach when building a big data batch processing solution.
Given the large volumes of data and correspondingly long processing
times typically involved in big data analysis, it can be useful to start by
implementing a proof-of-concept iteration in which a small subset of
the source data is used to validate the processing steps and results
before proceeding with a full analysis.
This enables you to test your big data processing design on a small
cluster, or even on a single-node on-premises cluster, before scaling out
to accommodate production-level data volumes.
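
One way to obtain the small subset for a proof-of-concept iteration is to sample rows from the source in a single pass. The sketch below uses reservoir sampling, which is a technique chosen for this example rather than one prescribed by the deck. The function name, the seed, and the sample size are illustrative assumptions.

```python
import random
from typing import Iterable, List


def sample_subset(lines: Iterable[str], k: int, seed: int = 0) -> List[str]:
    """Reservoir sampling: pick k lines uniformly from a stream of unknown length."""
    rng = random.Random(seed)  # fixed seed so a proof-of-concept run is repeatable
    reservoir: List[str] = []
    for index, line in enumerate(lines):
        if index < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Replace an existing entry with probability k / (index + 1).
            slot = rng.randint(0, index)  # randint is inclusive on both ends
            if slot < k:
                reservoir[slot] = line
    return reservoir


if __name__ == "__main__":
    # Hypothetical source data standing in for a large file in blob storage.
    source = (f"row-{i}" for i in range(100_000))
    subset = sample_subset(source, k=10, seed=42)
    print(len(subset))
```

The source is read once and only `k` rows are held in memory, so the same function works whether the input is a small test file or a much larger export.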
The important point is that, irrespective of how you choose to use big data, the end result is the same:
some kind of analysis of the source data and a meaningful visualization of the results.

Hd insight overview

  • 2. Agenda  What is Big Data?  What is Hadoop? What is HDInsight?  Hadoop Ecosystem  HDInsight Overview  Working with HDInsight  Loading Data  Querying Data  Setting up an Environment  Q&A
  • 3. What is Big Data?  Data being collected in ever-escalating volumes, at increasingly high velocities, and for a widening variety of unstructured formats.  Describes any large body of digital information from the text in a Twitter feed, to the sensor information from industrial equipment, to information about customer browsing and purchases on an online catalog.  Can be historical (meaning stored data) or real-time (meaning streamed directly from the source).
  • 4. What is Hadoop and HDInsight?  Apache Hadoop is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity hardware. It accomplishes two tasks: massive data storage and faster processing.  HDInsight is Microsoft’s cloud based implementation of Hadoop. HDInsight was architected to handle any amount of data, scaling from terabytes to petabytes on demand, and allows users to scale up or down as needed. Microsoft has partnered with Hortonworks to bring Hadoop to Windows.
  • 5. Hadoop Names & Technologies  Hadoop is composed of 3 core components:  HDFS – Hadoop Distributed File System (can store all kinds of data without prior organization. Java-based)  MapReduce – software programming model for processing large sets of data in parallel  YARN – Resource management framework for scheduling and handling resource requests from distributed applications  There are other Hadoop components that can be leveraged within HDInsight:  Pig – Simpler scripting for MapReduce transformation. Uses language called PigLatin  Hive – A SQL-like querying language that presents data in the form of tables  Sqoop – ETL-like tool that moves data between Hadoop and relational databases  Oozie – a Hadoop job scheduler  Additional technologies included:  Ambari, Avro, Hbase, Mahout, Storm, Zookeeper
  • 6. HDInsight / Hadoop Ecosystem
  • 7. Advantages of Hadoop in the Cloud (HDInsight)
      • State-of-the-art Hadoop components
      • High availability and reliability of clusters
      • Efficient and economical data storage with Azure Blob storage, a Hadoop-compatible option
      • Integration with other Azure services, including Websites and SQL Database
      • Low entry cost
  • 8. Working with HDInsight
      • To get started with HDInsight, you need an MSDN account and access to the Azure portal
      • The main components are:
          • HDInsight cluster (can scale the number of nodes up or down as needed)
          • Azure Blob storage (data repository in Azure)
      • Queries and jobs can be run through the “Query Console” interface in the Azure portal, or through Visual Studio
      • To use HDInsight in Visual Studio, you need Azure SDK 2.5 for .NET (VS 2013 | VS 2012 | VS 2015 Preview)
  • 9. Loading Data to HDInsight
      • There are many ways to upload data to Azure Blob storage. Some of the more common ones include:
          • Visual Studio
          • PowerShell scripts
          • Azure Storage Explorer
          • CloudXplorer
          • Azure Explorer
  • 10. Querying Data in HDInsight
      • The easiest way to query data is through Hive, which imposes a structure on the data and uses a SQL-like language called HiveQL.
      • Hive applies a “schema on read” when accessing the data; no physical table is actually created
      • The queries are translated into MapReduce jobs
      • Hive works best with more structured data
      • For unstructured data, use Pig
          • Uses a scripting language called Pig Latin to execute MapReduce jobs
          • An alternative to writing Java code
          • Pig Latin statements follow the general flow: Load → Transform → Dump or Store
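The “schema on read” idea from slide 10 can be sketched in plain Python. This is not Hive or HiveQL: the sample data, the `SCHEMA` list, and both function names are invented for illustration. The raw text stays untouched, and column names and types are applied only at the moment the data is read, roughly what a grouped sum over a Hive table would do.

```python
import csv
import io

# Hypothetical raw file contents, stored as-is with no table definition.
RAW_LOG = "2015-02-01,web,120\n2015-02-01,mobile,45\n2015-02-02,web,98\n"

# The "schema": column names and types, applied only when reading.
SCHEMA = [("day", str), ("channel", str), ("visits", int)]


def read_with_schema(raw_text, schema):
    """Yield typed rows; the raw text itself is never rewritten."""
    for fields in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


def visits_by_channel(rows):
    """Aggregate visits per channel, similar in spirit to a GROUP BY."""
    totals = {}
    for row in rows:
        totals[row["channel"]] = totals.get(row["channel"], 0) + row["visits"]
    return totals


if __name__ == "__main__":
    print(visits_by_channel(read_with_schema(RAW_LOG, SCHEMA)))
```

Swapping in a different `SCHEMA` reinterprets the same raw data without reloading it, which is the practical benefit of deferring the schema to read time.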
  • 11. Models to Consider when Approaching a Big Data Solution (Case 1: Iterative Exploration)
      Choose this model when:
      • Handling data that you cannot process using existing systems, perhaps by performing complex calculations and transformations that are beyond the capabilities of existing systems to complete in a reasonable time.
      • Collecting feedback from customers through email, web pages, or external sources such as social media sites, then analyzing it to get a picture of customer sentiment for your products.
      • Combining information with other data, such as demographic data that indicates population density and characteristics in each city where your products are sold.
      • Dumping data from your existing information systems into HDInsight so that you can work with it without interrupting other business processes or risking corruption of the original data.
      • Trying out new ideas and validating processes before implementing them within the live system.
  • 12. Models to Consider when Approaching a Big Data Solution (Case 2: Data Warehouse on Demand)
      Choose this model when:
      • Storing data in a way that allows you to minimize storage cost by taking advantage of cloud-based storage systems, and minimizing runtime cost by initiating a cluster to perform processing only when required.
      • Exposing both the source data in raw form, and the results of queries executed over this data in the familiar row and column format, to a wide range of data analysis tools.
      • Storing schemas (or, to be precise, metadata) for tables that are populated by the queries you execute, and partitioning the data in tables based on a clustered index so that each has a separate metadata definition and can be handled separately.
      • Creating views based on tables, and creating functions for use in both tables and queries.
      • Consuming the results directly in business applications through interactive analytical tools such as Excel, or in corporate reporting platforms such as SQL Server Reporting Services.
  • 13. Models to Consider when Approaching a Big Data Solution (Case 3: ETL Automation)
      Choose this model when:
      • Extracting and transforming data before you load it into your existing databases or analytical tools.
      • Categorizing and restructuring data, and extracting summary results to remove duplication and redundancy.
      • Preparing data so that it is in the appropriate format and has appropriate content to power other applications or services.
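The ETL steps listed on slide 13 (restructure, categorize, remove duplicates, summarize) might look like the following in miniature. This is a plain-Python sketch with made-up order records and a made-up size threshold; on HDInsight the same steps would typically be written in Pig or Hive and run over files in Blob storage.

```python
from collections import Counter


def categorize(amount):
    """Assign a coarse category; the threshold of 100 is an arbitrary example."""
    return "large" if amount >= 100 else "small"


def transform(records):
    """Drop duplicate order ids, normalize fields, and add a category."""
    seen = set()
    cleaned = []
    for order_id, customer, amount in records:
        if order_id in seen:
            continue  # duplicate record: keep only the first occurrence
        seen.add(order_id)
        value = float(amount)
        cleaned.append({
            "order_id": order_id,
            "customer": customer.strip().lower(),
            "amount": value,
            "size": categorize(value),
        })
    return cleaned


def summarize(rows):
    """Count cleaned rows per category, ready to load into a reporting table."""
    return dict(Counter(row["size"] for row in rows))


if __name__ == "__main__":
    raw = [(1, " Alice ", "120.50"), (2, "bob", "30"), (1, " Alice ", "120.50")]
    print(summarize(transform(raw)))
```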
  • 14. Models to Consider when Approaching a Big Data Solution (Case 4: BI Integration)
      Choose this model when:
      • You have an existing enterprise data warehouse and BI system that you want to augment with data from outside your organization.
      • You want to explore new ways to combine data in order to provide better insight into history and to predict future trends.
      • You want to give users more opportunities for self-service reporting and analysis that combines managed business data and big data from other sources.
  • 15. Overview of the Big Data Process
      Data analysis is in many ways an iterative process, and you should take this approach when building a big data batch processing solution. Given the large volumes of data and correspondingly long processing times typically involved in big data analysis, it can be useful to start with a proof-of-concept iteration in which a small subset of the source data is used to validate the processing steps and results before proceeding with a full analysis. This enables you to test your big data processing design on a small cluster, or even on a single-node on-premises cluster, before scaling out to accommodate production-level data volumes.
      The important point is that, irrespective of how you choose to use big data, the end result is the same: some kind of analysis of the source data and meaningful visualization of the results.
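One way to obtain the small proof-of-concept subset described on slide 15 is to sample records by hashing a key, so that the same records are selected on every run. This is a hedged sketch in plain Python, not an HDInsight feature; the function names and the `user-N` keys are invented for illustration.

```python
import zlib


def in_sample(record_key, percent):
    """Return True for roughly `percent` % of keys, stable across runs."""
    return zlib.crc32(record_key.encode("utf-8")) % 100 < percent


def sample(records, percent, key=lambda record: record):
    """Keep only the records whose key falls inside the sample."""
    return [record for record in records if in_sample(key(record), percent)]


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    subset = sample(users, 10)
    print(len(subset), "of", len(users), "records selected")
```

Hash-based selection is used here instead of random sampling so that repeated test runs see the same subset, which makes it easier to compare results between iterations before scaling out.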
  • 16. References
      • http://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-introduction/
      • https://msdn.microsoft.com/en-us/library/dn749858.aspx
      • https://msdn.microsoft.com/en-us/library/dn749816.aspx
      • http://social.technet.microsoft.com/wiki/contents/articles/13820.introduction-to-azure-hdinsight.aspx

Editor's Notes

  1. Microsoft has a partnership deal with Hortonworks so that they get the releases/updates 1-2 months ahead of the general release