High Availability Explained
Maciej Lasyk
Kraków, devOPS meetup #2
2014-01-28

Maciej Lasyk, High Availability Explained

“Anything that can go wrong, will go wrong”
Murphy's law

An electrical explosion and fire Saturday at a Houston data
center operated by The Planet has taken the entire facility offline.
The company claimed power to the facility was interrupted when a
transformer exploded. Officials reported that three walls were blown
down, causing a fire.

Three walls of the electrical equipment room on the first floor
blew several feet from their original position, and the underground
cabling that powers the first floor of H1 was destroyed.

High Availability is in the eye of the beholder
CEO: we don't lose sales
Sales: we can extend our offer based on the HA level
Account managers: we don't upset our customers (that often)
Developers: we can be proud – our services are working ;)
System engineers: we can sleep well (and fsck, we love to!)
Technical support: no calls? Back to WoW then.. ;)

So how many 9's?

Monthly (30 days): 1 hour of outage means 100% - 0.13888% ~= 99.86112% availability
Yearly (365 days): 1 hour of outage means 100% - 0.01142% ~= 99.98858% availability

Availability               Downtime (year)    Downtime (month)
90% (“one nine”)           36.5 days          72 hours
95%                        18.25 days         36 hours
97%                        10.96 days         21.6 hours
98%                        7.30 days          14.4 hours
99% (“two nines”)          3.65 days          7.2 hours
99.5%                      1.83 days          3.6 hours
99.8%                      17.52 hours        86.23 minutes
99.9% (“three nines”)      8.76 hours         43.2 minutes
99.99% (“four nines”)      52.56 minutes      4.32 minutes
99.999% (“five nines”)     5.26 minutes       25.9 seconds

https://jazz.net/wiki/bin/view/Deployment/HighAvailability
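
The downtime figures above follow directly from the availability percentage. A minimal sketch of that arithmetic, assuming a 365-day year and a 30-day month (the function and constant names are made up for illustration):

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24   # 8760, leap years ignored (assumption)
HOURS_PER_MONTH = 30 * 24   # 720, a 30-day month (assumption)


def allowed_downtime(availability_percent: float, period_hours: float) -> timedelta:
    """Return the downtime budget for a given availability over a period."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return timedelta(hours=period_hours * unavailable_fraction)


def availability_after_outage(outage_hours: float, period_hours: float) -> float:
    """Return the availability (in percent) left after one outage in the period."""
    return 100.0 * (1.0 - outage_hours / period_hours)


if __name__ == "__main__":
    for level in (99.0, 99.9, 99.99, 99.999):
        print(level,
              allowed_downtime(level, HOURS_PER_YEAR),
              allowed_downtime(level, HOURS_PER_MONTH))
```

Calling availability_after_outage(1, HOURS_PER_MONTH) reproduces the "1 hour of outage per month" case.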

HA terminology
RPO: Recovery Point Objective; how much data can we afford to lose?
RTO: Recovery Time Objective; how long may it take to recover?
MTBF: Mean Time Between Failures; the average time between failures
(failure density function -> reliability function)

https://en.wikipedia.org/wiki/Mean_time_between_failures
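
The "density function -> reliability function" step can be made concrete under one common assumption: a constant failure rate, i.e. an exponential failure density with rate 1 / MTBF. The sketch below uses that assumption; it is not the only possible model, and the function name is illustrative.

```python
import math


def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of running t_hours without a failure.

    Assumes an exponential failure density f(t) = (1/MTBF) * exp(-t/MTBF),
    whose reliability function is R(t) = exp(-t / MTBF).
    """
    return math.exp(-t_hours / mtbf_hours)


if __name__ == "__main__":
    # With this model only about 37% of units survive until t = MTBF.
    print(reliability(1000, 1000))
```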

SLA: Service Level Agreement;
formal definitions (customer <-> provider)
OLA: Operational Level Agreement; definitions within the organization;
helps us keep the SLAs we provide

SLAs..
So what is written in SLAs?

Availability               Downtime (year)    Downtime (month)
90%                        36.5 days          72 hours
95%                        18.25 days         36 hours
97%                        10.96 days         21.6 hours
98%                        7.30 days          14.4 hours
99%                        3.65 days          7.2 hours
99.5% (EC2, EBS)           1.83 days          3.6 hours
99.8%                      17.52 hours        86.23 minutes
99.9% (SoftLayer, IBM)     8.76 hours         43.2 minutes
99.99%                     52.56 minutes      4.32 minutes
99.999%                    5.26 minutes       25.9 seconds

http://aws.amazon.com/ec2/sla/
http://www.softlayer.com/about/service-level-agreement

The availability figures in SLAs are only the service provider's goals
Usually, when a goal is not met, the company compensates the customer by paying back fees

How deep is this hole?
app layer (core, db, cache)
data storage
operating system
hardware
networking
location
So we would like to achieve 99.9999%, which is about 30 s of downtime per year
Even a proof of concept is very hard to deliver: about 5 s of downtime per layer per year!
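
Where the "5 s per layer" comes from, as a rough sketch. It assumes the six layers fail independently, that any single layer failure takes the service down (layers in series), and that the budget is split equally; none of this is stated on the slide, and the names are illustrative.

```python
import math

LAYERS = ["app layer", "data storage", "operating system",
          "hardware", "networking", "location"]
SECONDS_PER_YEAR = 365 * 24 * 3600


def combined_availability(layer_availabilities):
    """Availability of layers in series: the product of the per-layer values."""
    return math.prod(layer_availabilities)


def per_layer_target(total_availability: float, layer_count: int) -> float:
    """Availability each of layer_count equal layers needs to reach the total."""
    return total_availability ** (1.0 / layer_count)


if __name__ == "__main__":
    target = per_layer_target(0.999999, len(LAYERS))
    budget_seconds = (1.0 - target) * SECONDS_PER_YEAR
    print(f"per-layer availability: {target:.9f}")
    print(f"per-layer downtime budget: {budget_seconds:.2f} s/year")
```

The same combined_availability function also shows the reverse effect: six layers at 99.9% each give a noticeably lower total than 99.9%.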

Load-balancing and failover

LB:

http://www.netdigix.com/linux-loadbalancing.php

Failover:

http://www.simplefailover.com/

LB: 4th layer or 7th?

4th layer:
- high performance
- low cost
- just do the LB work!
- reliable
- scalable

7th layer:
- good for quick fixes / patches
- not that scalable
- low performance
- complex codebase
- custom code for protocols
- cookies? what about memcache..
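
A toy illustration of the difference: a layer 4 balancer only sees the connection (addresses and ports), while a layer 7 balancer sees the parsed request and can route on its content. The backend addresses and routing rules below are hypothetical, and this is a sketch rather than how any particular balancer is implemented.

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend addresses


def pick_l4(src_ip: str, src_port: int) -> str:
    """Layer 4: only the connection tuple is visible, so hash it."""
    key = f"{src_ip}:{src_port}".encode()
    index = int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % len(BACKENDS)
    return BACKENDS[index]


def pick_l7(path: str, cookies: dict) -> str:
    """Layer 7: the HTTP request is visible, so route on its content."""
    if cookies.get("backend") in BACKENDS:
        return cookies["backend"]   # sticky session carried in a cookie
    if path.startswith("/static/"):
        return BACKENDS[0]          # hypothetical rule: static files on one backend
    return BACKENDS[1]
```

The layer 7 path has to parse the protocol before it can decide, which is where the extra cost and the "custom code for protocols" come from.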

Disaster Recovery

http://disasterrecovery.starwindsoftware.com/planning-disaster-recovery-for-virtualized-environments

Hot site: active synchronization, could already be serving traffic. Cost can be high
Warm site: periodic synchronization, DR tests needed. Low cost
Cold site: nothing here, just an echo and some space to spin up services; a nightmare

Planning for failure
Everything starts here - DNS:
- keep TTLs low (300 s). Can't go below 60 min? That's bad!
- check the SLA of your DNS servers (see the dnsmadeeasy.com history)
- what do you know about your DNS servers?
- zero downtime here is a must!
- this can be achieved with complicated network abracadabra
- remember what 99.9999% means?
- round robin is a load balancer, but without failover!
- GSLB (Global Server Load Balancing): killed by OS / browser / server caching
- Global IP (SoftLayer etc.): a workaround for GSLB via routing
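
Why round robin alone is not failover, as a small simulation. The addresses and the health check are made up for the example; real DNS round robin happens in the resolver and client, not in application code.

```python
from itertools import cycle


def round_robin(addresses):
    """Plain round robin: rotate through every record, healthy or not."""
    return cycle(addresses)


def healthy_round_robin(addresses, is_healthy):
    """Rotate through addresses, skipping those that fail the health check."""
    while True:
        served = False
        for address in addresses:
            if is_healthy(address):
                served = True
                yield address
        if not served:
            raise RuntimeError("no healthy address left")


if __name__ == "__main__":
    records = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]  # hypothetical A records
    down = {"192.0.2.2"}
    plain = round_robin(records)
    print([next(plain) for _ in range(6)])    # still hands out the dead host
    checked = healthy_round_robin(records, lambda a: a not in down)
    print([next(checked) for _ in range(4)])  # dead host is skipped
```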

E-mail servers:
- it's as simple as MX records (delivering)
- it's almost as simple as a complicated system of SMTP servers (sending)
- it's not that simple with IMAP locking over a DFS (reading)

5 gmail-smtp-in.l.google.com.
10 alt1.gmail-smtp-in.l.google.com.
20 alt2.gmail-smtp-in.l.google.com.
30 alt3.gmail-smtp-in.l.google.com.
40 alt4.gmail-smtp-in.l.google.com.
With multiple MXes, watch out for spam!
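
How a sending server uses such a record set: the MX with the lowest preference number is tried first, and the others act as fallbacks. A minimal sketch, where try_host is a hypothetical callback standing in for a real SMTP delivery attempt:

```python
MX_RECORDS = [
    (5, "gmail-smtp-in.l.google.com."),
    (10, "alt1.gmail-smtp-in.l.google.com."),
    (20, "alt2.gmail-smtp-in.l.google.com."),
    (30, "alt3.gmail-smtp-in.l.google.com."),
    (40, "alt4.gmail-smtp-in.l.google.com."),
]


def delivery_order(mx_records):
    """Return MX hosts sorted by preference, lowest number first."""
    return [host for _, host in sorted(mx_records)]


def deliver(mx_records, try_host):
    """Try each MX in preference order; return the first host that accepts."""
    for host in delivery_order(mx_records):
        if try_host(host):
            return host
    return None  # every MX refused; a real sender would queue and retry later
```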

Web servers:
- it's as simple as a frontend load balancer
- did you really stick the user session to a particular server? Memcache!
- which LB balancing algorithm?
- how many LBs?
- what if the LB goes down?
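
The "Memcache!" point, sketched: if sessions live in a shared store instead of on one web server, any server can handle any request and losing a server does not lose the session. The in-memory class below only stands in for an external cache; the class and method names are illustrative, not a memcached client API.

```python
class SharedSessionStore:
    """In-memory stand-in for an external cache such as memcached."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def set(self, session_id, session):
        self._data[session_id] = session


class WebServer:
    """A web server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        self.store.set(session_id, {**session, "cart": cart})
        return cart


if __name__ == "__main__":
    store = SharedSessionStore()
    web1, web2 = WebServer("web1", store), WebServer("web2", store)
    web1.add_to_cart("session-1", "book")
    print(web2.add_to_cart("session-1", "pen"))  # web2 sees what web1 stored
```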

DB servers:
- it's.. not that simple
- replication (master-master? The app should be aware..)
- replication ring? Complicated, works, but in case of failure...
- let's talk about MySQL:
  - no-SPOF solution: MySQL Cluster
  - MySQL Galera Cluster: synchronous, active-active multi-master
  - master-master: simply works
  - failover? Yoshinori Matsunobu's mysql-master-ha
  - MySQL Utilities (http://www.clusterdb.com/mysql/mysql-utilities-webinar-qa-replay-now-available/)

Caching servers:
- this is a cache, for God's sake; why would we use HA here?
- just use a proper architecture, like... redundancy.

Load balancers:
- remember to fail over the IP addresses!

Storage (DFSes):
- GlusterFS: we'll see it in action in a minute
- NFS? Could be, over some SAN / NAS (a high-cost solution)
- CephFS: just like GlusterFS, it's great and does the job
- DRBD: lower level, works at the block-device layer; slow...

GlusterFS:
- low cost (could be..)
- distributed volumes
- replicated volumes
- striped volumes
- and...
- distributed striped volumes
- distributed replicated volumes
- distributed striped replicated volumes
- sounds good? :)

GlusterFS: replicated volumes vs geo-replication
- replicated volumes:
  - mirror data
  - provide HA
  - synchronous replication
- geo-replication:
  - mirrors data across geographically distributed clusters
  - backs up data for DR
  - asynchronous replication (periodic checks)
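
The practical difference between the two modes is what can be lost when the primary dies. The toy model below is not GlusterFS code and says nothing about its internals; the class names are invented to show synchronous versus periodic asynchronous copying.

```python
class SyncReplicatedVolume:
    """A write is acknowledged only once both copies hold the block."""

    def __init__(self):
        self.primary, self.replica = [], []

    def write(self, block):
        self.primary.append(block)
        self.replica.append(block)


class AsyncReplicatedVolume:
    """A write is acknowledged at once; the replica catches up periodically."""

    def __init__(self):
        self.primary, self.replica = [], []

    def write(self, block):
        self.primary.append(block)

    def sync(self):
        """Periodic check: ship whatever the replica is still missing."""
        self.replica.extend(self.primary[len(self.replica):])

    def lost_on_primary_failure(self):
        """Blocks that exist only on the primary right now."""
        return self.primary[len(self.replica):]
```

In RPO terms: the synchronous volume loses nothing, while the asynchronous one can lose everything written since the last sync.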

HA for virtualization solutions?
- it's really complicated, like...

Tools
The most important tool would be the conclusion from the picture below:

- DNS: round robin, GSLB, low TTLs, Global IP
- Load balancers (L7, stateless services): HAProxy, Pound, Nginx
- Failover (stateful services):
  - IP: Keepalived + sysctl
  - Managing: Pacemaker (manager) + Corosync (messaging)
- (almost) All-in-one: Linux Virtual Server
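
The idea behind IP failover, reduced to its core: among the healthy nodes, the one with the highest priority holds the virtual IP, which is the election rule of VRRP (the protocol Keepalived implements). The sketch below models only that rule; node names and priorities are hypothetical, and it is not Keepalived configuration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int
    healthy: bool = True


def elect_vip_holder(nodes):
    """Return the healthy node with the highest priority, or None if all are down."""
    candidates = [node for node in nodes if node.healthy]
    return max(candidates, key=lambda node: node.priority, default=None)


if __name__ == "__main__":
    nodes = [Node("lb1", 150), Node("lb2", 100)]
    print(elect_vip_holder(nodes).name)  # lb1 holds the virtual IP
    nodes[0].healthy = False
    print(elect_vip_holder(nodes).name)  # lb2 takes over
```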

Turn on HA thinking!
Main goal of HA? Improve user experience!
- keep the app fully functional
- keep the app resistant and tolerant to faults
- provide a method for a successful audit
- sleep well (anyone awake?) ;)

Thank you :)
http://maciek.lasyk.info/sysop
maciek@lasyk.info
@docent-net

Orchestrating Docker containers at scale
 
Ghost in the shell
Ghost in the shellGhost in the shell
Ghost in the shell
 
Scaling and securing node.js apps
Scaling and securing node.js appsScaling and securing node.js apps
Scaling and securing node.js apps
 
Node.js security
Node.js securityNode.js security
Node.js security
 
Monitoring with Nagios and Ganglia
Monitoring with Nagios and GangliaMonitoring with Nagios and Ganglia
Monitoring with Nagios and Ganglia
 
Stop disabling SELinux!
Stop disabling SELinux!Stop disabling SELinux!
Stop disabling SELinux!
 
RHEL/Fedora + Docker (and SELinux)
RHEL/Fedora + Docker (and SELinux)RHEL/Fedora + Docker (and SELinux)
RHEL/Fedora + Docker (and SELinux)
 
Shall we play a game? PL version
Shall we play a game? PL versionShall we play a game? PL version
Shall we play a game? PL version
 

Recently uploaded

9 Ways Pastors Will Use AI Everyday By 2029
9 Ways Pastors Will Use AI Everyday By 20299 Ways Pastors Will Use AI Everyday By 2029
9 Ways Pastors Will Use AI Everyday By 2029
Big Click Syndicate LLC
 
Dev Dives: Mining your data with AI-powered Continuous Discovery
Dev Dives: Mining your data with AI-powered Continuous DiscoveryDev Dives: Mining your data with AI-powered Continuous Discovery
Dev Dives: Mining your data with AI-powered Continuous Discovery
UiPathCommunity
 
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My IdentityCNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
Cynthia Thomas
 
Chapter 2 - Testing Throughout SDLC V4.0
Chapter 2 - Testing Throughout SDLC V4.0Chapter 2 - Testing Throughout SDLC V4.0
Chapter 2 - Testing Throughout SDLC V4.0
Neeraj Kumar Singh
 
Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0
Neeraj Kumar Singh
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
Vijayananda Mohire
 
How to Optimize Call Monitoring: Automate QA and Elevate Customer Experience
How to Optimize Call Monitoring: Automate QA and Elevate Customer ExperienceHow to Optimize Call Monitoring: Automate QA and Elevate Customer Experience
How to Optimize Call Monitoring: Automate QA and Elevate Customer Experience
Aggregage
 
Metadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - DatastratoMetadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - Datastrato
Zilliz
 
Summer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdf
Summer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdfSummer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdf
Summer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdf
Anna Loughnan Colquhoun
 
Building an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and MilvusBuilding an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and Milvus
Zilliz
 
Kubernetes Cloud Native Indonesia Meetup - June 2024
Kubernetes Cloud Native Indonesia Meetup - June 2024Kubernetes Cloud Native Indonesia Meetup - June 2024
Kubernetes Cloud Native Indonesia Meetup - June 2024
Prasta Maha
 
Product Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdfProduct Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdf
gaydlc2513
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
Aurora Consulting
 
An Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise IntegrationAn Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise Integration
Safe Software
 
Supercomputing from the Desktop Workstation
Supercomputingfrom the Desktop WorkstationSupercomputingfrom the Desktop Workstation
Supercomputing from the Desktop Workstation
Larry Smarr
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
APJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes WebinarAPJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes Webinar
ThousandEyes
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
BookNet Canada
 
Database Management Myths for Developers
Database Management Myths for DevelopersDatabase Management Myths for Developers
Database Management Myths for Developers
John Sterrett
 
Chapter 5 - Managing Test Activities V4.0
Chapter 5 - Managing Test Activities V4.0Chapter 5 - Managing Test Activities V4.0
Chapter 5 - Managing Test Activities V4.0
Neeraj Kumar Singh
 

Recently uploaded (20)

9 Ways Pastors Will Use AI Everyday By 2029
9 Ways Pastors Will Use AI Everyday By 20299 Ways Pastors Will Use AI Everyday By 2029
9 Ways Pastors Will Use AI Everyday By 2029
 
Dev Dives: Mining your data with AI-powered Continuous Discovery
Dev Dives: Mining your data with AI-powered Continuous DiscoveryDev Dives: Mining your data with AI-powered Continuous Discovery
Dev Dives: Mining your data with AI-powered Continuous Discovery
 
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My IdentityCNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
 
Chapter 2 - Testing Throughout SDLC V4.0
Chapter 2 - Testing Throughout SDLC V4.0Chapter 2 - Testing Throughout SDLC V4.0
Chapter 2 - Testing Throughout SDLC V4.0
 
Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
 
How to Optimize Call Monitoring: Automate QA and Elevate Customer Experience
How to Optimize Call Monitoring: Automate QA and Elevate Customer ExperienceHow to Optimize Call Monitoring: Automate QA and Elevate Customer Experience
How to Optimize Call Monitoring: Automate QA and Elevate Customer Experience
 
Metadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - DatastratoMetadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - Datastrato
 
Summer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdf
Summer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdfSummer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdf
Summer24-ReleaseOverviewDeck - Stephen Stanley 27 June 2024.pdf
 
Building an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and MilvusBuilding an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and Milvus
 
Kubernetes Cloud Native Indonesia Meetup - June 2024
Kubernetes Cloud Native Indonesia Meetup - June 2024Kubernetes Cloud Native Indonesia Meetup - June 2024
Kubernetes Cloud Native Indonesia Meetup - June 2024
 
Product Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdfProduct Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdf
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
An Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise IntegrationAn Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise Integration
 
Supercomputing from the Desktop Workstation
Supercomputingfrom the Desktop WorkstationSupercomputingfrom the Desktop Workstation
Supercomputing from the Desktop Workstation
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
APJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes WebinarAPJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes Webinar
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
 
Database Management Myths for Developers
Database Management Myths for DevelopersDatabase Management Myths for Developers
Database Management Myths for Developers
 
Chapter 5 - Managing Test Activities V4.0
Chapter 5 - Managing Test Activities V4.0Chapter 5 - Managing Test Activities V4.0
Chapter 5 - Managing Test Activities V4.0
 

High Availability (HA) Explained

  • 1. High Availability Explained Maciej Lasyk Kraków, devOPS meetup #2 2014-01-28 Maciej Lasyk, High Availability Explained 1/14
  • 5. “Anything that can go wrong, will go wrong” (Murphy's law)
       An electrical explosion and fire Saturday at a Houston data center operated by The Planet has taken the entire facility offline. The company claimed power to the facility was interrupted when a transformer exploded. Official reports say that three walls were blown down, causing a fire.
       Three walls of the electrical equipment room on the first floor blew several feet from their original position, and the underground cabling that powers the first floor of H1 was destroyed.
  • 12. High Availability is in the eye of the beholder
       CEO: we don't lose sales
       Sales: we can extend our offer based on the HA level
       Account managers: we don't upset our customers (that often)
       Developers: we can be proud, our services are working ;)
       System engineers: we can sleep well (and fsck, we love to!)
       Technical support: no calls? Back to WoW then.. ;)
  • 17. So how many 9's?
       Monthly: 1 hour of outage means 100% - 0.13888% ~= 99.86112% availability
       Yearly: 1 hour of outage means 100% - 0.01142% ~= 99.98858% availability

       Availability               Downtime (year)   Downtime (month)
       90% (“one nine”)           36.5 days         72 hours
       95%                        18.25 days        36 hours
       97%                        10.95 days        21.6 hours
       98%                        7.30 days         14.4 hours
       99% (“two nines”)          3.65 days         7.2 hours
       99.5%                      1.83 days         3.6 hours
       99.8%                      17.52 hours       86.4 minutes
       99.9% (“three nines”)      8.76 hours        43.2 minutes
       99.99% (“four nines”)      52.56 minutes     4.32 minutes
       99.999% (“five nines”)     5.26 minutes      25.9 seconds
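The arithmetic behind the table can be reproduced with a few lines of Python. This is a minimal sketch, assuming a 365-day year and a 30-day month (the periods the table's figures imply); the function names are illustrative and not part of the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: 30-day month


def allowed_downtime_seconds(availability_percent: float, period_seconds: int) -> float:
    """Return the downtime budget for a period at the given availability level."""
    return (1 - availability_percent / 100) * period_seconds


def availability_percent(outage_seconds: float, period_seconds: int) -> float:
    """Return the availability (in percent) after a given total outage."""
    return 100 * (1 - outage_seconds / period_seconds)


if __name__ == "__main__":
    for level in (99.0, 99.9, 99.99, 99.999):
        yearly = allowed_downtime_seconds(level, SECONDS_PER_YEAR)
        monthly = allowed_downtime_seconds(level, SECONDS_PER_MONTH)
        print(f"{level}%: {yearly / 60:.2f} min/year, {monthly / 60:.2f} min/month")
    # One hour of outage in a month, as in the example above the table
    print(f"{availability_percent(3600, SECONDS_PER_MONTH):.5f}%")
```

Switching to calendar months or leap years changes the period constants and shifts the results slightly, which is why published tables differ in the last digits.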
  • 18. So how many 9's? https://jazz.net/wiki/bin/view/Deployment/HighAvailability
  • 21. HA terminology
       RPO: Recovery Point Objective; how much data can we lose?
       RTO: Recovery Time Objective; how long does it take to recover?
       MTBF: Mean Time Between Failures; time between failures (density function -> reliability function)
       https://en.wikipedia.org/wiki/Mean_time_between_failures
  • 23. HA terminology
       SLA: Service Level Agreement; formal definitions (customer <-> provider)
       OLA: Operational Level Agreement; definitions within the organization that help us keep the SLAs we provide
  • 25. SLAs..
       So what is written in SLAs?

       Availability               Downtime (year)   Downtime (month)
       90%                        36.5 days         72 hours
       95%                        18.25 days        36 hours
       97%                        10.95 days        21.6 hours
       98%                        7.30 days         14.4 hours
       99%                        3.65 days         7.2 hours
       99.5% (EC2, EBS)           1.83 days         3.6 hours
       99.8%                      17.52 hours       86.4 minutes
       99.9% (SoftLayer, IBM)     8.76 hours        43.2 minutes
       99.99%                     52.56 minutes     4.32 minutes
       99.999%                    5.26 minutes      25.9 seconds

       http://aws.amazon.com/ec2/sla/
       http://www.softlayer.com/about/service-level-agreement
  • 26. SLAs..
       The availability levels mentioned in SLAs are only the service provider's goals
       Usually, when a goal is not met, the company pays the fees back
  • 27. How deep is this hole?
       app layer (core, db, cache)
       data storage
       operating system
       hardware
       networking
       location
       So we would like to achieve 99.9999%, which is about 30 s of downtime per year
  • 28. How deep is this hole?
       Even a Proof of Concept is very hard to provide: 5 s of downtime per layer yearly!
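The "5 s per layer" figure follows from splitting the yearly budget across the six layers. The sketch below shows that split, plus the related point that layers stacked in series multiply their availabilities. It assumes a 365-day year and independent failures between layers; both are simplifications, and the function names are illustrative.

```python
from math import prod
from typing import Sequence

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

LAYERS = ["app layer", "data storage", "operating system",
          "hardware", "networking", "location"]


def per_layer_budget_seconds(target: float, layer_count: int) -> float:
    """Split the yearly downtime budget of `target` evenly across the layers."""
    return (1 - target) * SECONDS_PER_YEAR / layer_count


def combined_availability(per_layer: Sequence[float]) -> float:
    """Availability of layers in series: the service is up only if every layer is up.

    Assumes layer failures are independent of each other.
    """
    return prod(per_layer)


if __name__ == "__main__":
    budget = per_layer_budget_seconds(0.999999, len(LAYERS))
    print(f"budget per layer: {budget:.2f} s/year")
    # Six layers at "three nines" each give less than three nines overall
    print(f"six layers at 99.9%: {combined_availability([0.999] * 6):.6f}")
```

The second result illustrates why the hole is deep: every layer has to be noticeably better than the target for the whole stack to meet it.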
  • 31. LB: 4th layer or 7th?
       4th layer:
       - high performance
       - just do the LB work!
       - reliable
       - scalable
       7th layer:
       - low cost
       - good for quickfixes / patches
       - not that scalable
       - low performance
       - complex codebase
       - custom code for protocols
       - cookies? what about memcache..
  • 34. Disaster Recovery
       http://disasterrecovery.starwindsoftware.com/planning-disaster-recovery-for-virtualized-environments
       Hot site: active synchronization, could be serving services. Cost can be high
       Warm site: periodic synchronization, DR tests needed. Low costs
       Cold site: nothing here, just an echo and some place to spin up services; a nightmare
  • 36. Planning for failure
       Everything starts here: DNS
       - keep TTLs low (300s). Can't make it under 60 min? That's bad!
       - check the SLA of your DNS servers (dnsmadeeasy.com history)
       - what do you know about DNSes?
       - zero downtime here is a must!
       - this can be achieved with complicated network abracadabra
       - remember what 99.9999% means?
       - round robin is a load balancer, but without failover!
       - GSLB (Global Server Load Balancing): killed by OS/browser/server caching
       - GlobalIP (SoftLayer etc.): a workaround for GSLB via routing
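Because round-robin DNS only hands out several addresses and does nothing when one of them dies, any failover has to happen on the client side. The sketch below shows one way a client might walk the returned addresses; `connect_with_failover` and the injected `connect` callable are hypothetical names used for illustration, and the addresses come from the documentation range 192.0.2.0/24.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def connect_with_failover(addresses: Iterable[str], connect: Callable[[str], T]) -> T:
    """Try each address in order and return the first successful connection."""
    last_error = None
    for address in addresses:
        try:
            return connect(address)
        except OSError as error:  # refused, timed out, unreachable, ...
            last_error = error
    raise ConnectionError("no address answered") from last_error


if __name__ == "__main__":
    def fake_connect(address: str) -> str:
        # Stand-in for a real connection attempt: the first server is down.
        if address == "192.0.2.10":
            raise ConnectionRefusedError("server is down")
        return f"connected to {address}"

    print(connect_with_failover(["192.0.2.10", "192.0.2.11"], fake_connect))
```

In real code the address list could come from `socket.getaddrinfo`, and `socket.create_connection` already tries each resolved address in turn. Clients that only ever use the first address are the ones that see the outage.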
  • 37. Planning for failure
       E-mail servers:
       - it's as simple as MX records (delivering)
       - it's almost as simple as a complicated system of SMTP servers (sending)
       - it's not that simple with IMAP locking over a DFS (reading)

       5  gmail-smtp-in.l.google.com.
       10 alt1.gmail-smtp-in.l.google.com.
       20 alt2.gmail-smtp-in.l.google.com.
       30 alt3.gmail-smtp-in.l.google.com.
       40 alt4.gmail-smtp-in.l.google.com.

       When MXing, watch the spam!
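MX records give delivery its redundancy through the preference number: a sending server tries the lowest value first and moves on to the next host when one is unreachable. A minimal sketch of that ordering, using the records listed on the slide (the function name is illustrative, and hosts with equal preference are simply left in sorted order here):

```python
from typing import List, Tuple

# (preference, host) pairs as listed on the slide, deliberately shuffled
MX_RECORDS: List[Tuple[int, str]] = [
    (20, "alt2.gmail-smtp-in.l.google.com."),
    (5, "gmail-smtp-in.l.google.com."),
    (40, "alt4.gmail-smtp-in.l.google.com."),
    (10, "alt1.gmail-smtp-in.l.google.com."),
    (30, "alt3.gmail-smtp-in.l.google.com."),
]


def delivery_order(records: List[Tuple[int, str]]) -> List[str]:
    """Return hosts in the order a sender would try them: lowest preference first."""
    return [host for _, host in sorted(records)]


if __name__ == "__main__":
    for host in delivery_order(MX_RECORDS):
        print(host)
```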
  • 38. Planning for failure
       WEB servers:
       - it's as simple as some frontend load balancer
       - did you really stick the user session to a particular server? Memcache!
       - LB balancing algorithm
       - how many LBs?
       - what if the LB goes down?
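The point about sticky sessions can be illustrated with a toy model: if session data lives in a store shared by all web servers, the balancer is free to send each request anywhere, and losing one server loses no sessions. In the sketch below a plain dict stands in for a Memcache-like shared cache; the `Backend` class and `dispatch` function are hypothetical, not taken from the talk.

```python
import itertools


class Backend:
    """A web server that keeps no session state of its own."""

    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store  # the same store is shared by every backend

    def handle(self, session_id):
        # Read and update the visit counter in the shared store
        visits = self.sessions.get(session_id, 0) + 1
        self.sessions[session_id] = visits
        return f"{self.name}: visit {visits}"


shared_store = {}  # stand-in for a Memcache-like shared cache
backends = [Backend("web1", shared_store), Backend("web2", shared_store)]
round_robin = itertools.cycle(backends)  # simplest balancing algorithm


def dispatch(session_id):
    """Send the request to the next backend in round-robin order."""
    return next(round_robin).handle(session_id)
```

Calling `dispatch` repeatedly with the same session ID alternates between the two servers while the visit counter keeps increasing, which is what sticking sessions to one server would otherwise be needed for.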
  • 39. Planning for failure
       DB servers:
       - it's.. not that simple
       - replication (master-master? The app should be aware..)
       - replication ring? Complicated, works, but in case of failure...
       - let's talk about MySQL:
         - no-SPOF solution: MySQL Cluster
         - MySQL Galera cluster: synchronous, active-active multi-master
         - master-master: simply works
         - failover? Yoshinori Matsunobu's mysql-master-ha
         - MySQL Utilities (http://www.clusterdb.com/mysql/mysql-utilities-webinar-qa-replay-now-available/)
  • 42. Planning for failure
       Caching servers:
       - this is cache, for God's sake; why would we use HA here?
       - just use a proper architecture like... redundancy.
       Load balancers:
       - remember about failing over IP addresses!
       Storage (DFSes):
       - GlusterFS: we'll see it in action in a minute
       - NFS? Could be, over some SAN / NAS (high-cost solution)
       - CephFS: just like GlusterFS, it's great and does the work
       - DRBD: lower level, does the work on the block-device layer; slow...
  • 43. Planning for failure
       GlusterFS:
       - low cost (could be..)
       - distributed volumes
       - replicated volumes
       - striped volumes
       - and...
       - distributed-striped volumes
       - distributed-replicated volumes
       - distributed-striped-replicated volumes
       - sounds good? :)
  • 44. Planning for failure
       GlusterFS: replicated volumes vs Geo-replication
       - replicated:
         - mirrors data
         - provides HA
         - synchronous replication
       - Geo-replication:
         - mirrors data across geo-distributed clusters
         - ensures backing up data for DR
         - asynchronous replication (periodic checks)
  • 46. Planning for failure
       HA for virtualization solutions?
       - it's really complicated, like...
  • 49. Tools
       The most important tool would be the conclusion from the picture below:
       (picture not included in this transcript)
  • 54. Tools
       - DNS: round robin, GSLB, low TTLs, GlobalIP
       - Load balancers (L7, stateless services): HAProxy, Pound, Nginx
       - Failover (stateful services):
         - IP: Keepalived + sysctl
         - Managing: Pacemaker (manager) + Corosync (messaging)
       - (almost) All-In-One: Linux Virtual Server
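The idea behind IP failover can be shown with a toy model: several nodes are able to hold a shared (virtual) IP address, and the healthy node with the highest priority holds it at any moment. This sketch only illustrates that election rule; it is not how Keepalived is implemented or configured, and the `Node` class, names, and priorities are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    priority: int        # higher wins, as in priority-based election
    healthy: bool = True


def elect_vip_holder(nodes: List[Node]) -> Optional[Node]:
    """Return the healthy node with the highest priority, or None if all are down."""
    candidates = [node for node in nodes if node.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda node: node.priority)


if __name__ == "__main__":
    lb1, lb2 = Node("lb1", priority=150), Node("lb2", priority=100)
    print(elect_vip_holder([lb1, lb2]).name)   # lb1 holds the address
    lb1.healthy = False                         # simulate a failure
    print(elect_vip_holder([lb1, lb2]).name)   # the address moves to lb2
```

The same pattern answers the earlier "what if the LB goes down?" question: clients keep talking to one address while the node behind it changes.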
  • 55. Turn on HA thinking!
       Main goal of HA? Improve user experience!
       - keep the app fully functional
       - keep the app resistant and tolerant to faults
       - provide a method for a successful audit
       - sleep well (anyone awake?) ;)
  • 56. Thank you :) High Availability Explained Maciej Lasyk Kraków, devOPS meetup #2 2014-01-28 http://maciek.lasyk.info/sysop maciek@lasyk.info @docent-net