Introducing TiDB - Percona Live Frankfurt


TiDB is an open-source distributed SQL database developed by PingCAP that is compatible with MySQL. It provides horizontal scalability, high availability, and consistent distributed transactions. Mobike, which has 200 million users and 9 million bikes, uses TiDB to handle over 30 TB of data per day. While TiDB aims to be compatible with MySQL, some features like stored procedures work differently or are still in development.

What's hot

Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV

Kevin Xu

Presto Summit 2018 - 07 - Lyft

kbajda

Presto Summit 2018 - 09 - Netflix Iceberg

kbajda

Introducing Scylla Open Source 4.0

ScyllaDB

Since its inception, Scylla has offered a compelling alternative to Apache Cassandra, providing better performance for a lower cost of ownership. With Scylla Open Source 4.0 we continue to extend our CQL interface features and capabilities and also now provide an open source alternative to DynamoDB, allowing you to run your workloads anywhere, on any cloud provider, or on premises. Join ScyllaDB co-founders, CTO Avi Kivity and CEO Dor Laor, for a look at the new features in Scylla Open Source 4.0, and architectural and cost comparisons with the coming Cassandra 4.0. Topics will include:
- Improved consistency with our new Lightweight Transactions
- Scylla Operator for Kubernetes
- How we stack up against Apache Cassandra 4.0
- Our “run anywhere” DynamoDB alternative

Reactive database access with Slick3

takezoe

Presto Summit 2018 - 02 - LinkedIn

kbajda

Mongo DB Monitoring - Become a MongoDB DBA

Severalnines

Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]

Kevin Xu

Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB

ScyllaDB

In this talk AWS’ Ken Krupa, Head of Specialized Solutions Architecture, will describe the architecture and capabilities of two new AWS EC2 instance types perfect for data-intensive storage and IO-heavy workloads like ScyllaDB: the Intel-based I4i and the Graviton2-based I4g series. The Intel Xeon Ice Lake-based I4i series provides unparalleled raw horsepower for your most demanding workloads. Meanwhile, the Graviton2-powered I4g instances provide lower cost per storage on a power-efficient platform to deploy your cloud-native applications. Ken will also describe the AWS Nitro SSD, a new form of high-speed NVMe storage with a Flash Translation Layer built with Nitro controllers, which powers both of these instance families. ScyllaDB VP of Product Tzach Livyatan will then share benchmarking results showing how ScyllaDB behaves under load on these two instance types, providing maximum system utility and efficiency. To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.

Running a DynamoDB-compatible Database on Managed Kubernetes Services

ScyllaDB

With the release of Alternator, Scylla’s DynamoDB-compatible API, you can now take your locked-in DynamoDB workloads and run them anywhere. Scylla provides a cost-effective open source alternative to Amazon’s DynamoDB, deployable wherever a user would want: on-premises, on other public clouds like Microsoft Azure or Google Cloud Platform, still on AWS (such as the high-density i3en instances) or as a fully managed DBaaS. In this session, we will cover:
- Scylla Alternator: Scylla’s Amazon DynamoDB-compatible API
- Scylla Operator: Running Scylla Alternator on Kubernetes
- Demo Alternator
- Demo and explain DynamoDB on GKE

Introducing TiDB Operator

Kevin Xu

This document discusses PingCAP's Kubernetes operator for TiDB, an open source distributed SQL database. It provides a brief history of PingCAP and the TiDB community. It then gives a technical overview of TiDB's architecture before explaining how the TiDB operator works. The operator allows users to deploy and manage TiDB clusters on Kubernetes through custom resources that are controlled by custom controllers. This provides capabilities like automated scaling, updates, and failover for stateful applications running on Kubernetes. The operator is open source and TiDB is also available as a managed service on GCP Marketplace.

Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev

Altinity Ltd

Alexander Zaitsev presented on LifeStreet's experience implementing ClickHouse for their ad analytics platform. Some key points:
- LifeStreet processes over 10 billion events per day from their ad exchange and needed a high performance analytics solution.
- They tried various databases but migrated fully to ClickHouse due to its performance for analytics workloads.
- Major challenges included designing an efficient schema, sharding and replication strategy, and reliable data ingestion.
- ClickHouse's dictionary feature allowed them to implement normalized dimensions tables while supporting updates, improving storage efficiency and query performance.

Lookout on Scaling Security to 100 Million Devices

ScyllaDB

TiDB for Big Data

PingCAP

Shen Li, VP engineering at PingCAP, shares the slides about TiDB with the Big Data Ecosystem. Enjoy~ TiDB, an open source distributed HTAP database. Inspired by Google Spanner/F1, PingCAP develops TiDB, an open source distributed Hybrid Transactional/Analytical Processing (HTAP) database. TiDB features infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for online transactions and analysis.

Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB

Severalnines

This document provides an overview of online and offline migration strategies for migrating from a standalone MySQL or MySQL master-slave setup to a Galera Cluster. It discusses preparation steps like database schema checks and compatibility. It then outlines the process for offline migration using backups and restore, as well as online migration using MySQL replication to sync data between the existing and new Galera clusters before cutting over. Testing strategies like A/B testing in read-only mode are also presented.

Introduction to Data Engineer and Data Pipeline at Credit OK

Kriangkrai Chaonithi

The document discusses the role of data engineers and data pipelines. It begins with an introduction to big data and why data volumes are increasing. It then covers what data engineers do, including building data architectures, working with cloud infrastructure, and programming for data ingestion, transformation, and loading. The document also explains data pipelines, describing extract, transform, load (ETL) processes and batch versus streaming data. It provides an example of Credit OK's data pipeline architecture on Google Cloud Platform that extracts raw data from various sources, cleanses and loads it into BigQuery, then distributes processed data to various applications. It emphasizes the importance of data engineers in processing and managing large, complex data sets.

Webinar slides: Designing Open Source Databases for High Availability

Severalnines

It is said that if you are not designing for failure, then you are heading for failure. How do you design a database system from the ground up to withstand failure? This can be a challenge as failures happen in many different ways, sometimes in ways that would be hard to imagine. This is a consequence of the complexity of today’s database environments. At Severalnines we’re big fans of high availability databases and have seen our fair share of failure scenarios across the thousands of database deployments we enable every year. In this webinar replay, we’ll look at the different types of failures you might encounter and what mechanisms can be used to address them. We will also look at some of the popular HA solutions used today, and how they can help you achieve different levels of availability.

AGENDA
- Why design for High Availability?
- High availability concepts
- CAP theorem
- PACELC theorem
- Trade offs
- Deployment and operational cost
- System complexity
- Performance issues
- Lock management
- Architecting databases for failures
- Capacity planning
- Redundancy
- Load balancing
- Failover and switchover
- Quorum and split brain
- Fencing
- Multi datacenter and multi-cloud setups
- Recovery policy
- High availability solutions
- Database architecture determines Availability
- Active-Standby failover solution with shared storage or DRBD
- Master-slave replication
- Master-master cluster
- Failover and switchover mechanisms
- Reverse proxy
- Caching
- Virtual IP address
- Application connector

SPEAKER
Ashraf Sharif is a System Support Engineer at Severalnines. He previously worked in the hosting world and with the LAMP stack, where he served as principal consultant and head of the support team and delivered clustering solutions for large websites in the South East Asia region. His professional interests are system scalability and high availability.

Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...

ScyllaDB

ScyllaDB is a distributed database designed to scale horizontally and vertically — in theory. What about in practice? ScyllaDB’s Benny Halevy, Director, Software Engineering, will take you through the process and results of benchmarking our NoSQL database at the petabyte level, showing how you can use advanced features like workload prioritization to control priorities of transactional (read-write) and analytic (read-only) queries on the same cluster with smooth and predictable performance. To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.

Virtual training Intro to the Tick stack and InfluxEnterprise

InfluxData

Challenges in Building a Data Pipeline

Manish Kumar

The document discusses challenges in building a data pipeline including making it highly scalable, available with low latency and zero data loss while supporting multiple data sources. It covers expectations around real-time vs batch processing and streaming vs batch data. Implementation approaches like ETL vs ELT are examined along with replication modes, challenges around schema changes and NoSQL. Effective implementations should address transformations, security, replays, monitoring and more. Reference architectures like Lambda and Kappa are briefly outlined.

Similar to Introducing TiDB - Percona Live Frankfurt

Introducing TiDB @ SF DevOps Meetup

Kevin Xu

This document introduces TiDB, an open source distributed SQL database developed by PingCAP. It provides a 3-part summary: 1) TiDB is a hybrid transactional/analytical database inspired by Google Spanner/F1 that provides horizontal scalability, MySQL compatibility, and ACID transactions. It consists of TiDB, TiKV, and Placement Driver. 2) Mobike, a bike sharing platform with 200 million users, uses TiDB to power operations like bike locking/unlocking tracking and real-time analytics to handle high concurrency and permanent storage needs. 3) Over 200 companies use TiDB for two major uses: MySQL scalability and hybrid OLTP/OLAP architecture.

Introducing TiDB Operator [Cologne, Germany]

Kevin Xu

Scale Relational Database with NewSQL

PingCAP

"Smooth Operator" [Bay Area NewSQL meetup]

Kevin Xu

TiDB as an HTAP Database

PingCAP

TiDB is a NewSQL database that provides horizontal scalability, ACID transactions, high availability, and SQL support. It aims to be an HTAP (Hybrid Transactional/Analytical Processing) database by supporting both OLTP and OLAP workloads on the same database using the same SQL interface. TiDB achieves horizontal scalability through its distributed architecture with the TiKV storage engine and PD for metadata management. It supports ACID transactions through MVCC and Raft consensus. The database is available through replication of regions across nodes. TiDB also supports real-time analytics on the same dataset as transactions through its cost-based optimizer and distributed query processing engine. Spark can run queries directly against the

A Brief Introduction of TiDB (Percona Live)

PingCAP

TiDB is an open-source distributed SQL database that supports high availability, horizontal scalability, and consistent distributed transactions. It provides a MySQL compatible API and seamless online expansion. TiDB uses Raft for consensus and implements the MVCC model to support high concurrency. It also provides distributed transactions through a two-phase commit protocol. The architecture consists of a stateless SQL layer (TiDB) and a distributed transactional key-value storage (TiKV).

When Apache Spark Meets TiDB with Xiaoyu Ma

Databricks

During the past 10 years, big-data storage layers have mainly focused on analytical use cases. For analytical workloads, users usually offload data onto a Hadoop cluster and run queries on HDFS files. People struggle with modifications on append-only storage and with maintaining fragile ETL pipelines. On the other hand, although Spark SQL has proven to be an effective parallel query processing engine, some tricks common in traditional databases are not available due to the characteristics of the storage underneath. TiSpark sits directly on top of the storage engine of a distributed database (TiDB), expands Spark SQL’s planning with its own extensions, and utilizes unique features of the database storage engine to achieve functions not possible for Spark SQL on HDFS. With TiSpark, users are able to perform queries directly on changing / fresh data in real time. The takeaways from this talk are twofold:
- How to integrate Spark SQL with a distributed database engine, and the benefit of doing so
- How to leverage Spark SQL’s experimental methods to extend its capacity

TiDB + Mobike by Kevin Xu (@kevinsxu)

Kevin Xu

TiDB DevCon 2020 Opening Keynote

PingCAP

TiDB vs Aurora.pdf

ssuser3fb50b

TiDB and Amazon Aurora can be combined to provide analytics on transactional data without needing a separate data warehouse. The TiDB Data Migration (DM) tool migrates and replicates data from Aurora into TiDB for analytics queries. DM provides full data migration and incremental replication of binlog events from Aurora into TiDB. This allows transactional and analytical workloads to run against the same dataset without needing ETL pipelines.

Keynote -- Percona Live Europe 2018

Kevin Xu

DPDK Summit 2015 - NTT - Yoshihiro Nakajima

Jim St. Leger

DPDK summit 2015: It's kind of fun to do the impossible with DPDK

Lagopus SDN/OpenFlow switch

The document discusses using Lagopus software-defined networking (SDN) switches to demonstrate an SDN internet exchange (IX) at the Interop Tokyo 2015 technology show. Key points: - Two Lagopus SDN switches were deployed as the core switches in an SDN IX to enable automated provisioning of inter-autonomous system layer 2 connectivity and on-demand packet filtering between internet service providers. - The Lagopus switches achieved an average throughput of 2Gbps with no packet drops over a week during the show, demonstrating the potential for software switches in next-generation SDNs. - Previous work to optimize the Lagopus switch performance through techniques like hardware offloading to FPGAs helped enable its

Data-at-scale-with-TIDB Mydbops Co-Founder Kabilesh PR at LSPE Event

Mydbops

Explore the world of TiDB with Kabilesh PR, Co-Founder of Mydbops, as he unveils the potential of this open-source distributed SQL database. Dive into the architecture, scalability solutions, and production readiness of TiDB, and discover how it addresses MySQL scalability bottlenecks through sharding. Gain insights into its stateless SQL interface, transactional storage with TiKV, and analytical capabilities with TiFlash. Learn about TiDB's native sharding features, use cases across various industries, and its readiness for production environments. Delve into its limitations and discover how TiDB can transform your data management landscape.

MySQL Transformation Case Study: 80% Cost Savings & Uninterrupted Availabilit...

Mydbops

Discover how Mydbops achieved an impressive 80% cost savings and ensured uninterrupted availability through a transformative MySQL database case study. Join Vinoth Kanna RS, Co-Founder of Mydbops, as he shares insights into optimizing infrastructure, enhancing observability, and navigating critical technology decisions. Learn from real-world challenges, innovative solutions, and valuable takeaways for your own database management endeavors.

Titan and Cassandra at WellAware

twilmes

The Nitty Gritty of Advanced Analytics Using Apache Spark in Python

Miklos Christine

Apache Spark is the next big data processing tool for Data Scientists. As seen in the recent StackOverflow analysis, it's the hottest big data technology on their site! In this talk, I'll use the PySpark interface to leverage the speed and performance of Apache Spark. I'll focus on the end-to-end workflow for getting data into a distributed platform, and leverage Spark to process the data for advanced analytics. I'll discuss the popular Spark APIs used for data preparation, SQL analysis, and ML algorithms. I'll explain the performance differences between Scala and Python, and how Spark has bridged the gap in performance. I'll focus on PySpark as the interface to the platform, and walk through a demo to showcase the APIs.

Talk Overview:
- Spark's Architecture. What's out now and what's in Spark 2.0
- Spark APIs: Most common APIs used by Spark
- Common misconceptions and proper techniques for using Spark
- Demo: Walk through ETL of the Reddit dataset
- SparkSQL Analytics + Visualizations of the Dataset using MatplotLib
- Sentiment Analysis on Reddit Comments

RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro

PyData

Presto @ Zalando - Big Data Tech Warsaw 2020

Piotr Findeisen

SQREAM DB on IBM Power9

Ganesan Narayanasamy

GPU data warehouse Sqream DB provides a massively parallel processing engine powered by GPUs that is faster and more efficient than CPU-based systems. It can ingest terabytes of data per hour onto a single GPU and handle petabytes of data stored in a compact 2U server. With familiar SQL queries and connectors, Sqream DB accelerates analytics by 100x over traditional warehouses through its GPU-accelerated processing and columnar storage.

More from Morgan Tocker

Introducing Spirit - Online Schema Change

Introducing TiDB - Percona Live Frankfurt
