Intro To MongoDB
Buzz Moschetti
Enterprise Architect, MongoDB
Who is Talking To You?
• Yes, I use “Buzz” on my business cards
• Former Investment Bank Chief Architect at JPMorganChase
and Bear Stearns
• Over 30 years of designing and building systems
• Big and small
• Super-specialized to broadly useful in any vertical
• “Traditional” to completely disruptive
• Advocate of language leverage and strong factoring
• Inventor of perl DBI/DBD
• Not an award winner for PowerPoint
• Still programming – using emacs, of course
MongoDB: The Leading NoSQL Database
Document Data Model
Open-Source
Fully Featured
High Performance
Scalable
{
  name: "John Smith",
  pfxs: ["Dr.", "Mr."],
  address: "10 3rd St.",
  phones: [
    { number: "555-1212", type: "land" },
    { number: "444-1212", type: "mobile" }
  ]
}
The best way to run MongoDB
Automated. Supported. Secured.
Features beyond those in the community edition:
Enterprise-Grade Support
Commercial License
Ops Manager or Cloud Manager Premium
Encrypted & In-Memory Storage Engines
MongoDB Compass
BI Connector (SQL Bridge)
Advanced Security
Platform Certification
On-Demand Training
MongoDB Enterprise Edition
Company Vital Stats
500+ employees 2000+ customers
Over $311 million in funding
Offices in NY & Palo Alto and
across EMEA and APAC
The Database Landscape
Terminology
RDBMS      MongoDB
Database   Database
Table      Collection
Index      Index
Row        Document
Column     Field
Join       Embedding & Linking & $lookup
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirold", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  thumbnail: BinData(0,"AREhMQ=="),
  publisher: {
    name: "O’Reilly Media",
    founded: 1980,
    locations: ["CA", "NY"]
  }
}
The Data Is The Schema
> db.authors.find()
{
  _id: "X12",
  name: { first: "Kristina", last: "Chodorow" },
  personalData: {
    favoritePets: [ "bird", "dog" ],
    awards: [ {name: "Hugo", when: 1983}, {name: "SSFX", when: 1992} ]
  }
}
{
  _id: "Y45",
  name: { first: "Mike", last: "Dirolf" },
  personalData: {
    dob: ISODate("1970-04-05")
  }
}
Treat Your Data More Like Objects
// Java: maps
DBObject query = new BasicDBObject("publisher.founded", 1980);
Map m = (Map) collection.findOne(query);
Date pubDate = (Date) m.get("published_date"); // java.util.Date
List locs = (List) ((Map) m.get("publisher")).get("locations"); // nested under publisher
// Javascript: objects
m = collection.findOne({"publisher.founded" : 1980});
pubDate = m.published_date; // ISODate
year = pubDate.getUTCFullYear();
# Python: dictionaries
m = coll.find_one({"publisher.founded" : 1980})
pubYear = m["published_date"].year  # published_date is a datetime.datetime
Documents Natively Map to Language
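As a small illustration of that mapping, the book document from the earlier slide can be written as a plain Python dictionary. This is only a sketch that needs no server: with a driver such as pymongo, find_one would hand back a dictionary of the same shape. The variable names (book, pub_year, and so on) are illustrative and not from the slides.

```python
from datetime import datetime

# The book document from the earlier slide, in the shape a driver would
# return it to Python: nested dicts, lists, and native datetime values.
book = {
    "_id": "123",
    "title": "MongoDB: The Definitive Guide",
    "authors": [
        {"_id": "kchodorow", "name": "Kristina Chodorow"},
        {"_id": "mdirold", "name": "Mike Dirolf"},
    ],
    "published_date": datetime(2010, 9, 24),
    "pages": 216,
    "publisher": {
        "name": "O'Reilly Media",
        "founded": 1980,
        "locations": ["CA", "NY"],
    },
}

# Nested fields are reached with ordinary indexing; no row-to-object mapping layer.
pub_year = book["published_date"].year
locations = book["publisher"]["locations"]
author_names = [author["name"] for author in book["authors"]]

print(pub_year, locations, author_names)
```

Nothing here is specific to MongoDB; the point is that the document's shape and the language's native containers line up one to one.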
Traditional Data Design
• Static, Uniform Scalar Data
• Rectangles
• Low-level, physical representation
Document Data Design
• Flexible, Rich Shapes
• Objects
• High-level, business representation
MongoDB 3.0 Set The Stage…
7x-10x Performance, 50%-80% Less Storage
How: WiredTiger Storage Engine
• Same data model, query language, & ops
• 100% backwards compatible API
• Non-disruptive upgrade
• Storage savings driven by native compression
• Write performance gains driven by
– Document-level concurrency control
– More efficient use of HW threads
• Much better ability to scale vertically
[Chart: Performance, MongoDB 3.0 vs. MongoDB 2.6]
MongoDB 3.2 :
Efficient Enterprise MongoDB
• Much better ability to scale vertically
• Document Validation Rules
• Encryption at rest
• BI Connector (SQL bridge)
• MongoDB Compass
• New Relic & AppDynamics integration
• Backup snapshots on filesystem
• Advanced Full-text languages
• $lookup (“left outer JOIN”)
MongoDB Sweet Spot Use Cases
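• Big Data
• Product & Asset Catalogs
• Security & Fraud
• Internet of Things
• Database-as-a-Service
• Mobile Apps
• Customer Data Management
• Single View
• Social & Collaboration
• Content Management
• Complex Data Management
• Embedded / ISV
Example adopters: intelligence agencies; top investment and retail banks; a top global shipping company; a top industrial equipment manufacturer; a top media company; Cushman & Wakefield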
Unpack and Start The Server
$ tar xf mongodb-osx-x86_64-enterprise-3.2.0.tgz
$ mkdir -p ~/mydb/data
$ mongodb-osx-x86_64-enterprise-3.2.0/bin/mongod \
> --dbpath ~/mydb/data \
> --logpath ~/mydb/mongod.log \
> --fork
about to fork child process, waiting until server is
ready for connections.
forked process: 6517
child process started successfully, parent exiting
Verify Operation
$ mongodb-osx-x86_64-enterprise-3.2.0/bin/mongo
MongoDB shell version: 3.2.0
connecting to:
Server has startup warnings:
2016-01-01T12:44:01.646-0500 I CONTROL [initandlisten]
2016-01-01T12:44:01.646-0500 I CONTROL [initandlisten] ** WARNING:
soft rlimits too low. Number of files is 256, should be at least 1000
MongoDB Enterprise > use mug
switched to db mug
MongoDB Enterprise > db.foo.insert({name:"bob",hd: new ISODate()});
MongoDB Enterprise > db.foo.insert({name:"buzz"});
MongoDB Enterprise > db.foo.insert({pets:["dog","cat"]});
MongoDB Enterprise > db.foo.find();
{ "_id" : ObjectId("5686cef538ea4981e63111dd"), "name" : "bob", "hd"
: ISODate("2016-01-01T19:09:41.442Z") }
{ "_id" : ObjectId("5686…79d5"), "name" : "buzz" }
{ "_id" : ObjectId("5686…79d6"), "pets" : [ "dog", "cat" ] }
The Simple Java App
import com.mongodb.client.*;
import com.mongodb.*;
import java.util.Map;
public class mug1 {
    public static void main(String[] args) {
        try {
            // No arguments: connect to the local server on the default port
            MongoClient mongoClient = new MongoClient();
            MongoDatabase db = mongoClient.getDatabase("mug");
            MongoCollection coll = db.getCollection("foo");
            MongoCursor c = coll.find().iterator();
            while (c.hasNext()) {
                Map doc = (Map) c.next();
                System.out.println(doc);
            }
        } catch (Exception e) {
            // ...
        }
    }
}
Compile and Run!
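$ curl -o mongo-java-driver-3.0.4.jar https://oss.sonatype.org/content/repositories/releases/org/mongodb/mongo-java-driver/3.0.4/mongo-java-driver-3.0.4.jar
$ javac -cp mongo-java-driver-3.0.4.jar:. mug1.java
$ java -cp mongo-java-driver-3.0.4.jar:. mug1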
(logger output)
Document{{_id=5686cef538ea4981e63111dd, name=bob,
hd=Fri Jan 01 14:09:41 EST 2016}}
Document{{_id=5686c71338ea4981e63111d6, name=buzz}}
Document{{_id=5686c71938ea4981e63111d7, pets=[dog, cat]}}
The Same App in python
from pymongo import MongoClient
client = MongoClient()
db = client.mug
coll = db.foo
for c in coll.find():
    print(c)
$ python mug1.py
{u'_id': ObjectId('5686cef538ea4981e63111dd'), u'name': u'bob',
u'hd': datetime.datetime(2016, 1, 1, 19, 9, 41, 442000)}
{u'_id': ObjectId('5686f54b38ea4981e631124c'), u'name': u'buzz'}
{u'_id': ObjectId('5686f55138ea4981e631124d'), u'pets': [u'dog',
u'cat']}
…and, as expected in Perl…
$ perl -MMongoDB -MData::Dumper -e 'my $c =
MongoDB::MongoClient->new()->get_database("mug")->get_collection("foo")->find();
while($c->has_next()) { print Dumper($c->next()); }'
$VAR1 = { '_id' => bless( {'value' => '5686cef538ea4981e63111dd'},
'MongoDB::OID' ),
'hd' => bless( {
'local_rd_secs' => 68981,
'rd_nanosecs' => 442000000, // etc
}, 'DateTime' ),
'name' => 'bob'
};
$VAR1 = { '_id' => bless( {'value' => '5686c71338ea4981e63111d6'},
'MongoDB::OID' ),
'name' => 'buzz'
};
$VAR1 = { 'pets' => [ 'dog','cat'],
'_id' => bless( {'value' => '5686c71938ea4981e63111d7'},
'MongoDB::OID' )
};
Drivers A’Plenty
…and more
Document Validation: Stronger Than Schema…?
> db.createCollection("contacts", { "validator":
  { $or: [
    { $and: [ { "vers": 1 },
              { "customer_id": {$type: "string"} }
    ] },
    { $and: [ { "vers": 2 },
              { "customer_id": {$type: "string"} },
              { $or: [
                  { "name.f": {$type: "string"},
                    "name.l": {$type: "string"} },
                  { "ssn": {$type: "string"} }
              ] }
    ] }
  ] }
});
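The rule this validator expresses can be restated as ordinary code. The sketch below is a plain-Python paraphrase, not how the server evaluates it: is_valid_contact and get_path are hypothetical helper names, and isinstance(..., str) stands in for {$type: "string"}. Version 1 documents need a string customer_id; version 2 documents additionally need either both name parts or an ssn.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as "name.f"; return None if any step is missing."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def is_str(value):
    # Stand-in for {$type: "string"} in this sketch.
    return isinstance(value, str)


def is_valid_contact(doc):
    has_id = is_str(doc.get("customer_id"))
    if doc.get("vers") == 1:
        return has_id
    if doc.get("vers") == 2:
        has_name = is_str(get_path(doc, "name.f")) and is_str(get_path(doc, "name.l"))
        has_ssn = is_str(doc.get("ssn"))
        return has_id and (has_name or has_ssn)
    # Any other version matches neither branch of the top-level $or.
    return False


print(is_valid_contact({"vers": 1, "customer_id": "C1"}))
print(is_valid_contact({"vers": 2, "customer_id": "C2", "name": {"f": "Ann"}}))
```

The design point is that the rule is conditional on the document's own version field, which a fixed column schema cannot express directly.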
A Slightly Bigger Example
Relational
Customer ID   First Name   Last Name   City
0             John         Doe         New York
1             Mark         Smith       San Francisco
2             Jay          White       Dallas
3             Meagan       White       London
4             Edward       Daniels     Boston
Phone Number     Type   DNC      Customer ID
1-212-555-1212   home   T        0
1-212-555-1213   home   T        0
1-212-555-1214   cell   F        0
1-212-777-1212   home   T        1
1-212-777-1213   cell   (null)   1
1-212-888-1212   home   F        2
MongoDB
{ vers: 1,
  customer_id: 1,
  name: { "f": "Mark", "l": "Smith" },
  city: "San Francisco",
  phones: [
    { number: "1-212-777-1212", dnc: true, type: "home" },
    { number: "1-212-777-1213", type: "cell" }
  ]
}
MongoDB Queries Are Expressive
SQL
select A.did, A.lname, A.hiredate, B.type, B.number
from contact A left outer join phones B on (B.did = A.did)
where B.type = 'home' or A.hiredate > '2014-02-02'::date
MongoDB CLI
db.contacts.find({"$or": [
  {"phones.type": "home"},
  {"hiredate": {"$gt": new ISODate("2014-02-02")}}
]});
Find all contacts with at least one home phone or
hired after 2014-02-02
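To make the matching semantics concrete, here is a rough in-memory equivalent of that query in plain Python. It is a sketch only: the three sample contacts and their hire dates are made up for illustration, and matches is a hypothetical helper. The detail worth noticing is that a condition on an array of sub-documents is satisfied when any element matches.

```python
from datetime import datetime

# Hypothetical sample data; hire dates are invented for this sketch.
contacts = [
    {"name": {"f": "John", "l": "Doe"}, "hiredate": datetime(2013, 5, 1),
     "phones": [{"type": "home", "number": "1-212-555-1212"}]},
    {"name": {"f": "Jay", "l": "White"}, "hiredate": datetime(2014, 6, 1),
     "phones": [{"type": "cell", "number": "1-212-888-1212"}]},
    {"name": {"f": "Edward", "l": "Daniels"}, "hiredate": datetime(2012, 1, 1),
     "phones": []},
]

cutoff = datetime(2014, 2, 2)


def matches(doc):
    # "At least one home phone": true if any array element has type "home".
    has_home = any(p.get("type") == "home" for p in doc.get("phones", []))
    # "Hired after the cutoff": a missing hiredate never matches.
    hired_after = doc.get("hiredate") is not None and doc["hiredate"] > cutoff
    return has_home or hired_after


result = [d["name"]["l"] for d in contacts if matches(d)]
print(result)
```

No join is needed because the phones travel inside each contact.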
MongoDB Aggregation Is Powerful
Sum the different types of phones and create a list
of the owners if there is more than 1 of that type
> db.contacts.aggregate([
{$unwind: "$phones"}
,{$group: {"_id": "$phones.type", "count": {$sum:1},
"names": {$push: "$name"} }}
,{$match: {"count": {$gt: 1}}}
]);
{ "_id" : "home", "count" : 2, "names" : [
{ "f" : "John", "l" : "Doe" },
{ "f" : "Mark", "l" : "Smith" } ] }
{ "_id" : "cell", "count" : 4, "names" : [
{ "f" : "John", "l" : "Doe" },
{ "f" : "Meagan", "l" : "White" },
{ "f" : "Edward", "l" : "Daniels" },
{ "f" : "Mark", "l" : "Smith" } ] }
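The three stages of that pipeline can be mimicked in plain Python to show what each one does to the data. This is a sketch under assumptions, not the server's implementation: the sample contacts are invented, and each stage is an ordinary loop or comprehension over in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical sample data for this sketch.
contacts = [
    {"name": {"f": "John", "l": "Doe"},
     "phones": [{"type": "home"}, {"type": "cell"}]},
    {"name": {"f": "Mark", "l": "Smith"},
     "phones": [{"type": "home"}, {"type": "cell"}]},
    {"name": {"f": "Jay", "l": "White"},
     "phones": [{"type": "work"}]},
]

# $unwind: one (contact, phone) pair per element of the phones array.
unwound = [(c, p) for c in contacts for p in c["phones"]]

# $group: key on the phone type, count the pairs, and push the owner's name.
groups = defaultdict(lambda: {"count": 0, "names": []})
for contact, phone in unwound:
    group = groups[phone["type"]]
    group["count"] += 1
    group["names"].append(contact["name"])

# $match: keep only the phone types that occur more than once.
result = {ptype: g for ptype, g in groups.items() if g["count"] > 1}

for ptype, g in result.items():
    print(ptype, g["count"], g["names"])
```

Each stage consumes the previous stage's output, which is why the order of stages matters.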
$lookup: “Left Outer Join++”
> db.leases.aggregate([ ]);
{
"_id" : ObjectId("5642559e0d4f2076a43584fc"),
"leaseID" : "A5",
"sku" : "GD652",
"origDate" : ISODate("2010-01-01T00:00:00Z"),
"histDate" : ISODate("2010-10-28T00:00:00Z"),
"monthlyDue" : 10,
"vers" : 11,
"delinq" : { "d30" : 10, "d60" : 10, "d90" : 60 },
"credit" : 0
}
// 66 more ….
Step 1: Get a sense of the raw material
$lookup: “Left Outer Join++”
Step 2: Group leases by SKU and capture count and max value of 90
day delinquency
> db.leases.aggregate([
{$group: { _id: "$sku", n:{$sum:1},
max90:{$max:"$delinq.d90"} }}
]);
{ "_id" : "AC775", "n" : 27, "max90" : 20 }
{ "_id" : "AB123", "n" : 26, "max90" : 5 }
{ "_id" : "GD652", "n" : 14, "max90" : 80 }
$lookup: “Left Outer Join++”
Step 3: Reverse sort and then limit to the top 2
> db.leases.aggregate([
{$group: { _id: "$sku", n:{$sum:1},
max90:{$max:"$delinq.d90"} }}
,{$sort: {max90:-1}}
,{$limit: 2}
]);
{ "_id" : "GD652", "n" : 14, "max90" : 80 }
{ "_id" : "AC775", "n" : 27, "max90" : 20 }
$lookup: “Left Outer Join++”
Step 4: $lookup to product collection and assign to new field
> db.leases.aggregate([
{$group: { _id: "$sku", n:{$sum:1},
max90:{$max:"$delinq.d90"} }}
,{$sort: {max90:-1}}
,{$limit: 2}
,{$lookup: { from: "products", localField: "_id", foreignField:
"productSKU", as:"productData"}}
]);
{ "_id" : "GD652", "n" : 14, "max90" : 80,
"productData" : [
{
"_id" : ObjectId("5642559e0d4f2076a43584b5"),
"productType" : "rigidDumptruck",
"productSKU" : "GD652",
"properties" : {
"model" : "TR45",
"payload" : {
"std" : 45,
"unit" : "UStons"
},
$lookup: “Left Outer Join++”
Step 5: Trim excess data away, leaving just type
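> db.leases.aggregate([
{$group: { _id: "$sku", n:{$sum:1},
max90:{$max:"$delinq.d90"} }}
,{$sort: {max90:-1}}
,{$limit: 2}
,{$lookup: { from: "products", localField: "_id",
foreignField: "productSKU", as:"productData"}}
,{$project: { _id:1, n:1, max90:1, type:
"$productData.productType"}}
]);
{"_id":"GD652","n":14,"max90":80,"type":["rigidDumptruck"]}
{"_id":"AC775","n":27,"max90":20,"type":["loader"]}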
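The five steps can be traced end to end with a plain-Python imitation of the pipeline. This is a sketch, not the server's algorithm: the four leases and two products are invented stand-ins for the real collections, and each stage is written as a simple loop. The $lookup step behaves like a left outer join because a group with no matching product would still be kept, with an empty productData list.

```python
# Hypothetical miniature versions of the leases and products collections.
leases = [
    {"sku": "GD652", "delinq": {"d90": 80}},
    {"sku": "GD652", "delinq": {"d90": 60}},
    {"sku": "AC775", "delinq": {"d90": 20}},
    {"sku": "AB123", "delinq": {"d90": 5}},
]
products = [
    {"productSKU": "GD652", "productType": "rigidDumptruck"},
    {"productSKU": "AC775", "productType": "loader"},
]

# $group: one output document per SKU, with a count and the max 90-day delinquency.
grouped = {}
for lease in leases:
    g = grouped.setdefault(lease["sku"], {"_id": lease["sku"], "n": 0, "max90": None})
    g["n"] += 1
    d90 = lease["delinq"]["d90"]
    g["max90"] = d90 if g["max90"] is None else max(g["max90"], d90)

# $sort descending on max90, then $limit to the top 2.
top = sorted(grouped.values(), key=lambda g: g["max90"], reverse=True)[:2]

# $lookup: attach every product whose productSKU equals the group's _id.
# A group without a match keeps an empty list (left outer join behavior).
for g in top:
    g["productData"] = [p for p in products if p["productSKU"] == g["_id"]]

# $project: keep _id, n, max90 and pull productType out of the joined array.
result = [
    {"_id": g["_id"], "n": g["n"], "max90": g["max90"],
     "type": [p["productType"] for p in g["productData"]]}
    for g in top
]

for row in result:
    print(row)
```

Sorting and limiting before the join keeps the lookup to two groups instead of every SKU.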
• Single-click provisioning
• Scaling & upgrades
• Admin tasks
• Monitoring with charts
• Dashboards and alerts on 100+ metrics
• Backup and restore with point-in-time recovery
• Support for sharded clusters
MongoDB Ops/Cloud Manager
MongoDB High Availability
The Replica Set
• 1 Primary
• 2 – 48 Secondaries
• Greatest failure isolation: Locally
attached storage (spinning or SSD)
• Less failure isolation: SAN, FLASH
Automatic, Seamless Failover
HA and DR Are Isomorphic
Dual Data Center HA/DR Replica Set
[Diagram: application and driver connect to a primary and secondaries spread across Data Center 1 and Data Center 2, with an arbiter in DC3 or cloud]
MongoDB Scalability
What If:
1. Workload bottlenecks network or disk?
2. Data footprint starts to get large (e.g. 5TB)?
3. Regulations demand physical domicile of
data in-region?
4. Growth profile uncertain?
5. Budget prohibits buying capacity up-front?
Horizontal Scalability Through Sharding
Three Sharding Models:
1. Range
2. Tag
3. Hash
Shard 1
Symbols A-D
Shard 2
Symbols E-H
Shard n
Symbols ?-Z
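Two of the three models can be sketched as routing functions that map a shard key to a shard number. This is an illustration only: the three-shard layout, the boundary letters, and the function names are assumptions, and the MD5-based hash is a stand-in rather than the hash function MongoDB uses internally. Tag-based sharding, which pins ranges to named groups of shards, is not shown.

```python
import bisect
import hashlib

# Range sharding: each shard owns a contiguous range of key values.
# Inclusive upper bound of the first letter owned by each shard (hypothetical):
# shard 0 holds A-D, shard 1 holds E-H, shard 2 holds I-Z.
RANGE_UPPER_BOUNDS = ["D", "H", "Z"]


def range_shard(symbol):
    """Route by the first letter of the symbol; neighbors stay together."""
    first = symbol[0].upper()
    return bisect.bisect_left(RANGE_UPPER_BOUNDS, first)


def hash_shard(symbol, num_shards=3):
    """Route by a hash of the whole key; neighbors scatter across shards."""
    digest = hashlib.md5(symbol.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for sym in ["AAPL", "GOOG", "MSFT"]:
    print(sym, range_shard(sym), hash_shard(sym))
```

Range routing keeps adjacent keys on one shard, which suits range scans; hash routing trades that locality for a more even spread of writes.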
For More Information
Resource Location
Case Studies mongodb.com/customers
Presentations mongodb.com/presentations
Free Online Training education.mongodb.com
Webinars and Events mongodb.com/events
Documentation docs.mongodb.org
MongoDB Downloads mongodb.com/download
Additional Info info@mongodb.com
Questions & Answers
Thank You


Introduction to MongoDB

  • 1. Intro To MongoDB Buzz Moschetti Enterprise Architect, MongoDB buzz.moschetti@mongodb.com @buzzmoschetti
  • 2. Who is Talking To You? • Yes, I use “Buzz” on my business cards • Former Investment Bank Chief Architect at JPMorganChase and Bear Stearns • Over 30 years of designing and building systems • Big and small • Super-specialized to broadly useful in any vertical • “Traditional” to completely disruptive • Advocate of language leverage and strong factoring • Inventor of perl DBI/DBD • Not an award winner for PowerPoint • Still programming – using emacs, of course
  • 3. Agenda • What is MongoDB? • What are some good use cases? • How do I use it? • How do I deploy it?
  • 4. MongoDB: The Leading NoSQL Database Document Data Model Open- Source Fully Featured High Performance Scalable { name: “John Smith”, pfxs: [“Dr.”,”Mr.”], address: “10 3rd St.”, phones: [ { number: “555-1212”, type: “land” }, { number: “444-1212”, type: “mobile” } ] }
  • 5. 5 The best way to run MongoDB Automated. Supported. Secured. Features beyond those in the community edition: Enterprise-Grade Support Commercial License Ops Manager or Cloud Manager Premium Encrypted & In-Memory Storage Engines MongoDB Compass BI Connector (SQL Bridge) Advanced Security Platform Certification On-Demand Training MongoDB Enterprise Edition
  • 6. Company Vital Stats 500+ employees 2000+ customers Over $311 million in funding Offices in NY & Palo Alto and across EMEA, and APAC
  • 8. RDBMS MongoDB Database Database Table Collection Index Index Row Document Column Field Join Embedding & Linking & $lookup Terminology
  • 9. { _id: “123”, title: "MongoDB: The Definitive Guide", authors: [ { _id: "kchodorow", name: "Kristina Chodorow“ }, { _id: "mdirold", name: “Mike Dirolf“ } ], published_date: ISODate(”2010-09-24”), pages: 216, language: "English", thumbnail: BinData(0,"AREhMQ=="), publisher: { name: "O’Reilly Media", founded: 1980, locations: ["CA”, ”NY” ] } } The Data Is The Schema
  • 10. > db.authors.find() { _id: ”X12", name: { first: "Kristina”, last: “Chodorow” }, personalData: { favoritePets: [ “bird”, “dog” ], awards: [ {name: “Hugo”, when: 1983}, {name: “SSFX”, when: 1992} ] } } { _id: ”Y45", name: { first: ”Mike”, last: “Dirolf” } , personalData: { dob: ISODate(“1970-04-05”) } } Treat Your Data More Like Objects
  • 11. // Java: maps DBObject query = new BasicDBObject("publisher.founded", 1980); Map m = (Map) collection.findOne(query); Date pubDate = (Date) m.get("published_date"); // java.util.Date List locs = (List) ((Map) m.get("publisher")).get("locations"); // Javascript: objects m = collection.findOne({"publisher.founded" : 1980}); pubDate = m.published_date; // ISODate year = pubDate.getUTCFullYear(); # Python: dictionaries m = coll.find_one({"publisher.founded" : 1980}) pubYear = m["published_date"].year # datetime.datetime Documents Natively Map to Language
  • 12. 12 Traditional Data Design • Static, Uniform Scalar Data • Rectangles • Low-level, physical representation
  • 13. 13 Document Data Design • Flexible, Rich Shapes • Objects • High-level, business representation
  • 14. Agenda • What is MongoDB? • What are some good use cases? • How do I use it? • How do I deploy it?
  • 15. 15 MongoDB 3.0 Set The Stage… 7x-10x Performance, 50%-80% Less Storage How: WiredTiger Storage Engine • Same data model, query language, & ops • 100% backwards compatible API • Non-disruptive upgrade • Storage savings driven by native compression • Write performance gains driven by – Document-level concurrency control – More efficient use of HW threads • Much better ability to scale vertically MongoDB 3.0MongoDB 2.6 Performance
  • 16. 16 MongoDB 3.2 : Efficient Enterprise MongoDB • Much better ability to scale vertically + • Document Validation Rules • Encryption at rest • BI Connector (SQL bridge) • MongoDB Compass • New Relic & AppDynamics integration • Backup snapshots on filesystem • Advanced Full-text languages • $lookup (“left outer JOIN”) More general-purpose solutions
  • 17. 17 MongoDB Sweet Spot Use Cases Big Data Product & Asset Catalogs Security & Fraud Internet of Things Database-as- a- Service Mobile Apps Customer Data Management Single View Social & Collaboration Content Management Intelligence Agencies Top Investment and Retail Banks Top Global Shipping Company Top Industrial Equipment Manufacturer Top Media Company Top Investment and Retail Banks Complex Data Management Top Investment and Retail Banks Embedded / ISV Cushman & Wakefield
  • 18. Agenda • What is MongoDB? • What are some good use cases? • How do I use it? • How do I deploy it?
  • 20. 20 Unpack and Start The Server $ tar xf mongodb-osx-x86_64-enterprise-3.2.0.tgz $ mkdir -p ~/mydb/data $ mongodb-osx-x86_64-enterprise-3.2.0/bin/mongod > --dbpath ~/mydb/data > --logpath ~/mydb/mongod.log > --fork about to fork child process, waiting until server is ready for connections. forked process: 6517 child process started successfully, parent exiting
  • 21. 21 Verify Operation $ mongodb-osx-x86_64-enterprise-3.2.0/bin/mongo MongoDB shell version: 3.2.0 connecting to: Server has startup warnings: 2016-01-01T12:44:01.646-0500 I CONTROL [initandlisten] 2016-01-01T12:44:01.646-0500 I CONTROL [initandlisten] ** WARNING: soft rlimits too low. Number of files is 256, should be at least 1000 MongoDB Enterprise > use mug switched to db mug MongoDB Enterprise > db.foo.insert({name:”bob”,hd: new ISODate()}); MongoDB Enterprise > db.foo.insert({name:"buzz"}); MongoDB Enterprise > db.foo.insert({pets:["dog","cat"]}); MongoDB Enterprise > db.foo.find(); { "_id" : ObjectId("5686cef538ea4981e63111dd"), "name" : "bob", "hd" : ISODate("2016-01-01T19:09:41.442Z") } { "_id" : ObjectId("5686…79d5"), "name" : "buzz" } { "_id" : ObjectId("5686…79d6"), "pets" : [ "dog", "cat" ] }
  • 22. 22 The Simple Java App import com.mongodb.client.*; import com.mongodb.*; import java.util.Map; public class mug1 { public static void main(String[] args) { try { MongoClient mongoClient = new MongoClient(); MongoDatabase db = mongoClient.getDatabase("mug”); MongoCollection coll = db.getCollection("foo"); MongoCursor c = coll.find().iterator(); while(c.hasNext()) { Map doc = (Map) c.next(); System.out.println(doc); } } catch(Exception e) { // ... } } }
  • 23. 23 Compile and Run! $ curl -o mongo-java-driver-3.0.4.jar https://oss.sonatype.org/content/repositories/releases/org/mongodb/mongo-java-driver/3.0.4/mongo-java-driver-3.0.4.jar $ javac -cp mongo-java-driver-3.0.4.jar:. mug1.java $ java -cp mongo-java-driver-3.0.4.jar:. mug1 (logger output) Document{{_id=5686cef538ea4981e63111dd, name=bob, hd=Fri Jan 01 14:09:41 EST 2016}} Document{{_id=5686c71338ea4981e63111d6, name=buzz}} Document{{_id=5686c71938ea4981e63111d7, pets=[dog, cat]}}
  • 24. 24 The Same App in python from pymongo import MongoClient client = MongoClient() db = client.mug coll = db.foo for c in coll.find(): print c $ python mug1.py {u'_id': ObjectId('5686cef538ea4981e63111dd'), u'name': u'bob', u'hd': datetime.datetime(2016, 1, 1, 19, 9, 41, 442000)} {u'_id': ObjectId('5686f54b38ea4981e631124c'), u'name': u'buzz'} {u'_id': ObjectId('5686f55138ea4981e631124d'), u'pets': [u'dog', u'cat']}
  • 25. 25 …and, as expected in Perl… $ perl -MMongoDB -MData::Dumper –e 'my $c = MongoDB::MongoClient->new()->get_database("mug")- >get_collection("foo")->find(); while($c->has_next()) { print Dumper($c->next()); }’ $VAR1 = { '_id' => bless( {'value' => '5686cef538ea4981e63111dd’}, 'MongoDB::OID' ), 'hd' => bless( { 'local_rd_secs' => 68981, 'rd_nanosecs' => 442000000, // etc }, 'DateTime' ), 'name' => 'bob' }; $VAR1 = { '_id' => bless( {'value' => '5686c71338ea4981e63111d6’}, 'MongoDB::OID' ), 'name' => 'buzz' }; $VAR1 = { 'pets' => [ 'dog’,'cat’], '_id' => bless( {'value' => '5686c71938ea4981e63111d7’}, 'MongoDB::OID' ) };
  • 27. Document Validation: Stronger Than Schema…? > db.createCollection(”contacts", { "validator": { $or: [ { $and: [ { “vers": 1}, { ”customer_id": {$type: “string”} } ] }, { $and: [ { "vers": 2}, { ”customer_id": {$type: “string”} }, { $or: [ { ”name.f": {$type: “string”}, ”name.l": {$type: “string”}} , { ”ssn": {$type: “string”}} ] } ] }]});
  • 28. A Slightly Bigger Example Relational MongoDB { vers: 1, customer_id : 1, name : { “f”:"Mark”, “l”:"Smith” }, city : "San Francisco", phones: [ { number : “1-212-777-1212”, dnc : true, type : “home” }, { number : “1-212-777-1213”, type : “cell” }] } Customer ID First Name Last Name City 0 John Doe New York 1 Mark Smith San Francisco 2 Jay White Dallas 3 Meagan White London 4 Edward Daniels Boston Phone Number Type DNC Customer ID 1-212-555-1212 home T 0 1-212-555-1213 home T 0 1-212-555-1214 cell F 0 1-212-777-1212 home T 1 1-212-777-1213 cell (null) 1 1-212-888-1212 home F 2
  • 29. 29 MongoDB Queries Are Expressive SQL select A.did, A.lname, A.hiredate, B.type, B.number from contact A left outer join phones B on (B.did = A.did) where b.type = ’home' or A.hiredate > '2014-02-02'::date MongoDB CLI db.contacts.find({"$or”: [ {"phones.type":”home”}, {"hiredate": {”$gt": new ISODate("2014- 02-02")}} ]}); Find all contacts with at least one home phone or hired after 2014-02-02
  • 30. 30 MongoDB Aggregation Is Powerful Sum the different types of phones and create a list of the owners if there is more than 1 of that type > db.contacts.aggregate([ {$unwind: "$phones"} ,{$group: {"_id": "$phones.type", "count": {$sum:1}, "names": {$push: "$name"} }} ,{$match: {"count": {$gt: 1}}} ]); { "_id" : "home", "count" : 2, "names" : [ { "f" : "John", "l" : "Doe" }, { "f" : "Mark", "l" : "Smith" } ] } { "_id" : "cell", "count" : 4, "names" : [ { "f" : "John", "l" : "Doe" }, { "f" : "Meagan", "l" : "White" }, { "f" : "Edward", "l" : "Daniels" }, { "f" : "Mark", "l" : "Smith" } ] }
  • 31. 31 $lookup: “Left Outer Join++” > db.leases.aggregate([ ]); { "_id" : ObjectId("5642559e0d4f2076a43584fc"), "leaseID" : "A5", "sku" : "GD652", "origDate" : ISODate("2010-01-01T00:00:00Z"), "histDate" : ISODate("2010-10-28T00:00:00Z"), "monthlyDue" : 10, "vers" : 11, "delinq" : { "d30" : 10, "d60" : 10, "d90" : 60 }, "credit" : 0 } // 66 more …. Step 1: Get a sense of the raw material
  • 32. 32 $lookup: “Left Outer Join++” Step 2: Group leases by SKU and capture count and max value of 90 day delinquency > db.leases.aggregate([ {$group: { _id: "$sku", n:{$sum:1}, max90:{$max:"$delinq.d90"} }} ]); { "_id" : "AC775", "n" : 27, "max90" : 20 } { "_id" : "AB123", "n" : 26, "max90" : 5 } { "_id" : "GD652", "n" : 14, "max90" : 80 }
  • 33. 33 $lookup: “Left Outer Join++” Step 3: Reverse sort and then limit to the top 2 > db.leases.aggregate([ {$group: { _id: "$sku", n:{$sum:1}, max90:{$max:"$delinq.d90"} }} ,{$sort: {max90:-1}} ,{$limit: 2} ]); { "_id" : "GD652", "n" : 14, "max90" : 80 } { "_id" : "AC775", "n" : 27, "max90" : 20 }
  • 34. 34 $lookup: “Left Outer Join++” Step 4: $lookup to product collection and assign to new field > db.leases.aggregate([ {$group: { _id: "$sku", n:{$sum:1}, max90:{$max:"$delinq.d90"} }} ,{$sort: {max90:-1}} ,{$limit: 2} ,{$lookup: { from: "products", localField: "_id", foreignField: "productSKU", as:"productData"}} ]); { "_id" : "GD652”, "n" : 14, "max90" : 80, "productData" : [ { "_id" : ObjectId("5642559e0d4f2076a43584b5"), "productType" : "rigidDumptruck", "productSKU" : "GD652", "properties" : { "model" : "TR45", "payload" : { "std" : 45, "unit" : "UStons" },
  • 35. 35 $lookup: “Left Outer Join++” Step 5: Trim excess data away, leaving just type > db.leases.aggregate([ {$group: { _id: "$sku", n:{$sum:1}, max90:{$max:"$delinq.d90"} }} ,{$sort: {max90:-1}} ,{$limit: 2} ,{$lookup: { from: "products", localField: "_id", foreignField: "productSKU", as:"productData"}} ,{$project: { _id:1, n:1, max90:1, type: "$productData.productType"}} ]); {"_id":"GD652","n":14,"max90":80,"type":["rigidDumptruck"]} {"_id":"AC775","n":27,"max90":20,"type":["loader"]}
  • 36. Agenda • What is MongoDB? • What are some good use cases? • How do I use it? • How do I deploy it?
  • 37. 37 • Single-click provisioning • Scaling & upgrades • Admin tasks • Monitoring with charts • Dashboards and alerts on 100+ metrics • Backup and restore with point-in- time recovery • Support for sharded clusters MongoDB Ops/Cloud Manager
  • 39. 39 MongoDB High Availability PRIMARY Application DRIVER secondary secondary The Replica Set • 1 Primary • 2 – 48 Secondaries • Greatest failure isolation: Locally attached storage (spinning or SSD) • Less failure isolation: SAN, FLASH
  • 41. 41 HA and DR Are Isomorphic PRIMARY Application DRIVER secondary secondary Dual Data Center HA/DR Replica Set secondary Arbiter (DC3 or cloud) Data Center 1 Data Center 2
  • 42. 42 MongoDB Scalability PRIMARY Application DRIVER secondary secondary What If: 1. Workload bottlenecks network or disk? 2. Data footprint starts to get large (e.g. 5TB)? 3. Regulations demand physical domicile of data in-region? 4. Growth profile uncertain? 5. Budget prohibits buying capacity up-front?
  • 43. 43 Horizontal Scalability Through Sharding PRIMARY Application DRIVER secondary secondary PRIMARY secondary secondary PRIMARY secondary secondary mongos Three Sharding Models: 1. Range 2. Tag 3. Hash … Shard 1 Symbols A-D Shard 2 Symbols E-H Shard n Symbols ?-Z
  • 44. 44 For More Information Resource Location Case Studies mongodb.com/customers Presentations mongodb.com/presentations Free Online Training education.mongodb.com Webinars and Events mongodb.com/events Documentation docs.mongodb.org MongoDB Downloads mongodb.com/download Additional Info info@mongodb.com

  36. On behalf of all of us at MongoDB, thank you for attending this webinar! I hope what you saw and heard today gave you some insight and clues into what you might face in your own data design efforts. Remember you can always reach out to us at MongoDB for guidance. With that, code well and be well.