Graham Mainwaring and Robert Hodges summarize management of ClickHouse on Kubernetes using the ClickHouse Kubernetes Operator and introduce a new UI for it. Presented at the 15 Dec '22 SF Bay Area ClickHouse Meetup.
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Altinity Ltd
Presented at the webinar, July 31, 2019
Built-in replication is a powerful ClickHouse feature that helps scale data warehouse performance as well as ensure high availability. This webinar will introduce how replication works internally, explain configuration of clusters with replicas, and show you how to set up and manage ZooKeeper, which is necessary for replication to function. We'll finish off by showing useful replication tricks, such as utilizing replication to migrate data between hosts. Join us to become an expert in this important subject!
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
From webinars September 11 and September 17, 2019
ClickHouse is famous for speed. That said, you can almost always make it faster! This webinar uses examples to teach you how to deduce what queries are actually doing by reading the system log and system tables. We'll then explore standard ways to increase query speed: data types and encodings, filtering, join reordering, skip indexes, materialized views, session parameters, to name just a few. In each case we'll circle back to query plans and system metrics to demonstrate changes in ClickHouse behavior that explain the boost in performance. We hope you'll enjoy the first step to becoming a ClickHouse performance guru!
Speaker Bio:
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
Practical Partitioning in Production with PostgresEDB
Has your table become too large to handle? Have you thought about chopping it up into smaller pieces that are easier to query and maintain? What if it's in constant use? An introduction to the problems that can arise and how PostgreSQL's partitioning features can help, followed by a real-world scenario of partitioning an existing huge table on a live system. We will be looking at the problems caused by having very large tables in your database and how declarative table partitioning in Postgres can help. Also, how to perform dimensioning before but also after creating huge tables, partitioning key selection, the importance of upgrading to get the latest Postgres features and finally we will dive into a real-world scenario of having to partition an existing huge table in use on a production system.
Alexander Sapin from Yandex presents reasoning, design considerations, and implementation of ClickHouse Keeper. It replaces ZooKeeper in ClickHouse clusters, thereby simplifying operation enormously.
ClickHouse Materialized Views: The Magic ContinuesAltinity Ltd
Slides for the webinar, presented on February 26, 2020
By Robert Hodges, Altinity CEO
Materialized views are the killer feature of ClickHouse, and the Altinity 2019 webinar on how they work was very popular. Join this updated webinar to learn how to use materialized views to speed up queries hundreds of times. We'll cover basic design, last point queries, using TTLs to drop source data, counting unique values, and other useful tricks. Finally, we'll cover recent improvements that make materialized views more useful than ever.
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...HostedbyConfluent
In a real-time data ingestion pipeline for analytical processing, efficient and fast data loading to a columnar database such as ClickHouse favors large blocks over individual rows. Therefore, applications often rely on some buffering mechanism such as Kafka to store data temporarily, and having a message processing engine to aggregate Kafka messages into large blocks which then get loaded to the backend database. Due to various failures in this pipeline, a naive block aggregator that forms blocks without additional measures, would cause data duplication or data loss. We have developed a solution to avoid these issues, thereby achieving exactly-once delivery from Kafka to ClickHouse. Our solution utilizes Kafka’s metadata to keep track of blocks that we intend to send to ClickHouse, and later uses this metadata information to deterministically re-produce ClickHouse blocks for re-tries in case of failures. The identical blocks are guaranteed to be deduplicated by ClickHouse. We have also developed a run-time verification tool that monitors Kafka’s internal metadata topic, and raises alerts when the required invariants for exactly-once delivery are violated. Our solution has been developed and deployed to the production clusters that span multiple datacenters at eBay.
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
Webinar. Presented by Robert Hodges and Ned McClain, April 1, 2020
You are about to deploy ClickHouse into production. Congratulations! But what about monitoring? In this webinar we will introduce how to track the health of individual ClickHouse nodes as well as clusters. We'll describe available monitoring data, how to collect and store measurements, and graphical display using Grafana. We'll demo techniques and share sample Grafana dashboards that you can use for your own clusters.
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
Join the Altinity experts as we dig into ClickHouse sharding and replication, showing how they enable clusters that deliver fast queries over petabytes of data. We’ll start with basic definitions of each, then move to practical issues. This includes the setup of shards and replicas, defining schema, choosing sharding keys, loading data, and writing distributed queries. We’ll finish up with tips on performance optimization.
#ClickHouse #datasets #ClickHouseTutorial #opensource #ClickHouseCommunity #Altinity
-----------------
Join ClickHouse Meetups: https://www.meetup.com/San-Francisco-...
Check out more ClickHouse resources: https://altinity.com/resources/
Visit the Altinity Documentation site: https://docs.altinity.com/
Contribute to ClickHouse Knowledge Base: https://kb.altinity.com/
Join the ClickHouse Reddit community: https://www.reddit.com/r/Clickhouse/
----------------
Learn more about Altinity!
Site: https://www.altinity.com
LinkedIn: https://www.linkedin.com/company/alti...
Twitter: https://twitter.com/AltinityDB
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
This document discusses shipping data from PostgreSQL to ClickHouse using logical replication. It explains that logical replication in PostgreSQL replicates only DML statements, while physical replication replicates the entire database. It describes using pg2ch to replicate changes from PostgreSQL to ClickHouse tables using CollapsingMergeTree, ReplacingMergeTree, or MergeTree engines. Pg2ch accumulates changes in buffers and handles updating/deleting rows when using CollapsingMergeTree or ReplacingMergeTree engines.
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEOAltinity Ltd
Slides from Webinar. April 16, 2019
Data services are the latest wave of applications to catch the Kubernetes bug. Altinity is pleased to introduce the ClickHouse operator, which makes it easy to run scalable data warehouses on your favorite Kubernetes distro. This webinar shows how to install the operator and bring up a new data warehouse in three simple steps. We also cover storage management, monitoring, making config changes, and other topics that will help you operate your data warehouse successfully on Kubernetes. There is time for demos and Q&A, so bring your questions. See you online!
Speaker Bio:
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
This document provides an introduction and overview of PostgreSQL, including its history, features, installation, usage and SQL capabilities. It describes how to create and manipulate databases, tables, views, and how to insert, query, update and delete data. It also covers transaction management, functions, constraints and other advanced topics.
Altinity Cluster Manager: ClickHouse Management for Kubernetes and CloudAltinity Ltd
Webinar. August 21, 2019
By Robert Hodges and Altinity Engineering Team
Simplified management is a prerequisite for running any data warehouse at scale. Altinity is developing a new web-based console for ClickHouse called the Altinity Cluster Manager. It's now in beta and offers simplified operation of ClickHouse installations for users. In this webinar we introduce the ACM and demonstrate use on Kubernetes as well as Amazon Web Services. Attendees are welcome to sign up as beta testers and provide feedback. Please join us to see the future of Clickhouse management!
In 40 minutes the audience will learn a variety of ways to make postgresql database suddenly go out of memory on a box with half a terabyte of RAM.
Developer's and DBA's best practices for preventing this will also be discussed, as well as a bit of Postgres and Linux memory management internals.
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesAltinity Ltd
Slides for the Webinar, presented on March 6, 2019
For the webinar video visit https://www.altinity.com/
Extracting business insight from massive pools of machine-generated data is the central analytic problem of the digital era. ClickHouse data warehouse addresses it with sub-second SQL query response on petabyte-scale data sets. In this talk we'll discuss the features that make ClickHouse increasingly popular, show you how to install it, and teach you enough about how ClickHouse works so you can try it out on real problems of your own. We'll have cool demos (of course) and gladly answer your questions at the end.
Speaker Bio:
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
Apache Calcite (a tutorial given at BOSS '21)Julian Hyde
The document provides instructions for setting up the environment and coding tutorial for the BOSS'21 Copenhagen tutorial on Apache Calcite.
It includes the following steps:
1. Clone the GitHub repository containing sample code and dependencies.
2. Compile the project.
3. It outlines the draft schedule for the tutorial, which will cover topics like Calcite introduction, demonstration of SQL queries on CSV files, setting up the coding environment, using Lucene for indexing, and coding exercises to build parts of the logical and physical query plans in Calcite.
4. The tutorial will be led by Stamatis Zampetakis from Cloudera and Julian Hyde from Google, who are both committers to
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOAltinity Ltd
1. ClickHouse uses a MergeTree storage engine that stores data in compressed columnar format and partitions data into parts for efficient querying.
2. Query performance can be optimized by increasing threads, reducing data reads through filtering, restructuring queries, and changing the data layout such as partitioning strategy and primary key ordering.
3. Significant performance gains are possible by optimizing the data layout, such as keeping an optimal number of partitions, using encodings to reduce data size, and skip indexes to avoid unnecessary I/O. Proper indexes and encodings can greatly accelerate queries.
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfAltinity Ltd
Robert Hodges from Altinity, an enterprise provider of ClickHouse and developer of the ClickHouse Kubernetes operator, provides an introduction to running ClickHouse on Kubernetes. The presentation demonstrates how to deploy ClickHouse and Zookeeper on Kubernetes using the ClickHouse Kubernetes operator. It shows how to define ClickHouse installations using custom resources, access ClickHouse, and update the cluster configuration, such as changing the number of shards and replicas or ClickHouse version. The operator automatically applies configuration changes to pods.
Introduction to Jenkins X - a beginner's guideAndrew Bayer
The document provides an overview of Jenkins X, including:
- What Jenkins X is and how it differs from Jenkins
- How to install Jenkins X using the CLI
- Capabilities like quickstarts, build packs, pipelines, environments and ChatOps
- When Jenkins X would be suitable compared to manually configuring Kubernetes
- Next steps to learn more about Jenkins X
Deploying Windows Apps to Kubernetes with Draft and HelmJessica Deen
This document provides an overview and summary of a presentation about deploying Windows applications with Kubernetes, Draft, and Helm. It begins with introductions and disclaimers about the hands-on lab. It then discusses the history of building Kubernetes on Windows and mixed clusters. The presentation demonstrates deploying applications across Windows and Linux nodes using kubectl and shows how to consider Windows-specific aspects like resource limits and node selection. It also covers Helm charts for application deployment and management and using Draft to simplify and automate application development and deployment to Kubernetes.
This document provides an introduction and agenda for a two-day training on Kubernetes. Day one will cover Kubernetes concepts like pods, services, replica sets, deployments and namespaces. It will also include hands-on exercises. Day two will focus on additional concepts like config maps, secrets, auto-scaling and Helm. It will end with further hands-on experience and conclusions.
This document outlines the steps to run Kubernetes locally, including required installations like Java 8, Maven, Git, Kubernetes CLI (kubectl), Minikube, and Docker. It discusses benefits like cloud-native development and testing applications locally before deploying to cloud providers. The steps covered include starting Minikube, building and pushing a Docker image to Minikube's registry, deploying microservices interactively with kubectl or declaratively with YAML files, exposing services, and testing before stopping Minikube.
OSS Japan 2019 service mesh bridging Kubernetes and legacySteve Wong
how to join legacy VMs and bare metal machines to a Kubernetes service mesh so that VMs can consume Kubernetes services AND publish services used by Kubernetes hosted applications
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Ltd
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data - Presentation Slides
Altinity.Cloud is a fully automated cloud service for ClickHouse that is optimized for real-time analytics.
In this webinar, we’ll explain how Altinity.Cloud works, then show how to set up your first ClickHouse cluster. We’ll then tour important features like scale-up, scale-out, uptime schedules, and DBA tools to analyze your tables.
You’ll learn everything necessary to start working on real-time analytics today.
Bring your questions!
Presenters: Robert Hodges & Alexander Zaitsev
Note: This webinar will be recorded and later posted on our Webinar page (https://altinity.com/webinarspage/) or Altinity official Youtube channel (https://www.youtube.com/@Altinity).
DevOpsCon Berlin: Helm vs Operators – Do I Need to Decide?Nico Meisenzahl
In this session, Nico will talk about Helm and Operators as well as the battle between them. If you ever ask yourself one of the following questions, this talk is the right one for you:
"Do I have to decide?"
"Do I need to learn both?"
"What are the best tools for my use-case?"
In short: There is no battle. Both tools have different scopes. Nico will talk about the pros and cons, and what the projects themselves focus on. Join his talk to learn more!
Data warehouse on Kubernetes - gentle intro to Clickhouse Operator, by Robert...Altinity Ltd
San Diego Cloud Native Computing Meetup, January 23, 2020
Presented by Robert Hodges, Altinity CEO
Data services are the latest wave of applications to catch the Kubernetes bug, but how many people would guess that includes data warehouses? We proved it works by developing the ClickHouse Kubernetes operator, which is now in production use at companies like Mux.com. It's an open source operator to stand up and run ClickHouse, a popular Apache 2.0 data warehouse that can return queries on trillions of rows in seconds or less. This talk introduces ClickHouse and shows why it's a 'cloud friendly' DBMS. We'll go mano-a-mano with the ClickHouse operator, showing how you can spin up data warehouses in 60 seconds or less. We'll cover issues like storage management, monitoring and upgrade. In short, everything you need to know to try running your own ClickHouse data warehouses on Kubernetes.
This document provides an agenda and instructions for learning Kubernetes in 90 minutes. The agenda includes exercises on running a first web service in Kubernetes, revisiting pods, deployments and services, deploying with YAML files, and installing a microservices application called Guestbook. Key Kubernetes concepts covered include pods, deployments, services, YAML descriptors, and using deployments to scale applications. The document also provides background on containers, Docker, and the Kubernetes architecture.
KubeOne is an open source tool for managing the lifecycle of Kubernetes clusters, including installing, upgrading, and decommissioning clusters on major cloud providers and on-premises. It uses tools like kubeadm and Kubermatic machine-controller to provision clusters in a declarative way. The presentation demonstrates installing a highly available Kubernetes cluster on AWS using KubeOne by defining a cluster manifest and running the install command.
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfAltinity Ltd
The document discusses Altinity.Cloud Anywhere, a service that allows users to run ClickHouse databases on their own Kubernetes clusters. It provides automation of ClickHouse operations and management through the Altinity Connector. Users can prepare their own Kubernetes environment, connect it to Altinity.Cloud, and then launch and manage ClickHouse clusters on their infrastructure. Advanced topics covered include how the service works internally and how to get support from Altinity.
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOAltinity Ltd
The document discusses Altinity and their ClickHouse operator for Kubernetes. Altinity provides software and services for ClickHouse including 24/7 support, Kubernetes software, and training. The ClickHouse operator allows users to deploy and manage ClickHouse clusters on Kubernetes through declarative YAML configurations. It handles complex tasks like provisioning, networking, and persistence so that ClickHouse can be easily deployed and managed on Kubernetes.
The document discusses various Kubernetes concepts including pods, deployments, services, ingress, labels, health checks, config maps, secrets, volumes, autoscaling, resource quotas, namespaces, Helm, and the Kubernetes Dashboard. Kubernetes is a container orchestration tool that manages container deployment, scaling, and networking. It uses pods to group containers, deployments to manage pods, and services for exposing applications.
Kubespray and Ansible can be used to automate the installation of Kubernetes in a production-ready environment. Kubespray provides tools to configure highly available Kubernetes clusters across multiple Linux distributions. Ansible is an IT automation tool that can deploy software and configure systems. The document then provides a 6 step guide for installing Kubernetes on Ubuntu using kubeadm, including installing Docker, kubeadm, kubelet and kubectl, disabling swap, configuring system parameters, initializing the cluster with kubeadm, and joining nodes. It also briefly explains Kubernetes architecture including the master node, worker nodes, addons, CNI, CRI, CSI and key concepts like pods, deployments, networking,
Installing and Using Kubernetes is hard, but Operating Kubernetes is even harder! This BOF is for Kubernetes Operators to get together and discuss our day to day Operations, and for people new to Kubernetes to learn more about how to operate it.
This document discusses running CI/CD pipelines with VMware Cloud PKS using Jenkins X. It provides an overview of Jenkins X and how it extends Kubernetes with custom resource definitions. Jenkins X allows automation of CI/CD pipelines through a single CLI interface and includes features like GitOps promotion of applications between environments. The document also compares Jenkins X to Jenkins and notes how Jenkins X is focused specifically on Kubernetes and implementing best practices for native Kubernetes CI/CD.
Almost 3 years with Kubernetes and some "war stories", we will take the top-down approach to kubernetes and take a glimpse of the bottom-up and where we could customize it.
This document introduces the CitusTM IoT Ecosystem, which allows users to develop and integrate IoT products, visualize sensor data, and build sharing economy business models on a centralized platform. It can be deployed on dedicated or shared infrastructure using Docker Compose, Kubernetes, or AWS CloudFormation. The ecosystem provides services for device management, sensor analytics, recognition applications, and more through container-based microservices that can be easily deployed and shared across users. Setup instructions are included to deploy the ecosystem locally using Docker Compose or on AWS using a CloudFormation template.
Mattia Gandolfi - Improving utilization and portability with Containers and C...Codemotion
Google has pioneered the usage of containers at huge scale. Learn how we designed our systems to handle insane traffic loads, orchestrating complex, globally distributed applications, and how you can leverage this infrastructure and our agile development technologies to embrace the power of DevOps and Cloud on our Google Cloud Platform.
Similar to Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI (20)
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxAltinity Ltd
Building an Analytic Extension to MySQL with ClickHouse and Open Source
In this webinar Percona and Altinity offer suggestions and tips on how to recognize when MySQL is overburdened with analytics and can benefit from ClickHouse’s unique capabilities.
Also, they will walk you through important patterns for integrating MySQL and ClickHouse which will enable the building of powerful and cost-efficient applications that leverage the strengths of both databases.
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Altinity Ltd
Over the last few years Kubernetes has transitioned from an object of curiosity and fear to a robust platform for big data. Watch this webinar and you will learn how the Altinity Kubernetes Operator for ClickHouse enables users to run high performance analytics on ClickHouse. You will see a simple installation and teach you how to scale it into a cluster that can analyze 100s of terabytes of data. Along the way we’ll share our lessons for ClickHouse on Kubernetes in Altinity.Cloud. We built it on Kubernetes using the Altinity Operator and now run hundreds of clusters in the cloud. You can too!
Building an Analytic Extension to MySQL with ClickHouse and Open SourceAltinity Ltd
This is a joint webinar Percona - Altinity.
In this webinar we will discuss suggestions and tips on how to recognize when MySQL is overburdened with analytics and can benefit from ClickHouse’s unique capabilities.
We will then walk through important patterns for integrating MySQL and ClickHouse which will enable the building of powerful and cost-efficient applications that leverage the strengths of both databases.
Fun with ClickHouse Window Functions-2021-08-19.pdfAltinity Ltd
Fun with ClickHouse Window Functions | Altinity Webinar
Window functions have arrived in ClickHouse!
Our webinar will start with an introduction to standard window function syntax and show how it is implemented in ClickHouse. We’ll next show you problems that you can now solve easily using window functions. Finally, we’ll compare window functions to arrays, another powerful ClickHouse feature.
There will be time for questions with our SQL experts.
Join us for a complete overview of this long-awaited feature!
Speakers:
Robert Hodges, CEO @Altinity
Vitaliy Zakaznikov, QA Manager and Architect @Altinity
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Altinity Ltd
Altinity Stable Builds offer a ClickHouse distribution that is ready for production use and with 3 years of maintenance. Our webinar introduces the special features of Stable Builds and describes how we build them from ClickHouse Long-Term Support (LTS) releases. We’ll show you how to find them and install them yourself, then guide you through the important topic of upgrading. We’ll also walk through how to use Altinity Stable Builds in Altinity.Cloud, our managed ClickHouse platform for high-performance analytics.
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Altinity Ltd
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHouse Webinar Slides
Monitoring is the key to the successful operation of any software service, but commercial solutions are complex, expensive, and slow. Let us show you how to build monitoring that is simple, cost-effective, and fast using open-source stacks easily accessible to any developer.
We’ll start with the elements of monitoring systems: data ingest, query engine, visualization, and alerting. We’ll then explain and contrast two implementation approaches. The first uses VictoriaMetrics, a fast-growing, high-performance time series database that uses PromQL for queries. The second is based on ClickHouse, a popular real-time analytics database that speaks SQL. Fast, affordable monitoring is within reach. This webinar provides designs and working code to get you there.
Presented by:
Roman Khavronenko, Co-Founder at VictoriaMetrics
Robert Hodges, CEO at Altinity
ClickHouse ReplacingMergeTree in Telecom AppsAltinity Ltd
Alexandr Dubovikov of QXIP explains how to use ClickHouse ReplacingMergeTree engine for an important Telecom use case: tracking state of calls from incoming call detail records aka CDRs. (https://www.meetup.com/san-francisco-bay-area-clickhouse-meetup/events/289605843/)
Adventures with the ClickHouse ReplacingMergeTree EngineAltinity Ltd
Presentation on ReplacingMergeTree by Robert Hodges of Altinity at the 14 December 2022 SF Bay Area ClickHouse Meetup (https://www.meetup.com/san-francisco-bay-area-clickhouse-meetup/events/289605843/)
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
This document provides an overview of building a real-time analytics application with Apache Pulsar and Apache Pinot. It introduces Mary Grygleski and Mark Needham, describes what real-time analytics is, and discusses the properties of real-time analytics systems. It then demonstrates how to ingest data from the Wikimedia recent changes feed into Pulsar and Pinot for real-time analytics and builds a dashboard with the data using Streamlit.
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...Altinity Ltd
OSA Con 2022: What Data Engineering Can Learn from Frontend Engineering
Pete Hunt - Elementl
Frontend engineering went through a revolution in the last decade. I'll recap what happened, and how a similar revolution started in data engineering.
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfAltinity Ltd
OSA Con 2022: Welcome to OSA CON Version 2022
Robert Hodges - Altinity
Join us as we guide you through the conference and highlight the many presenters who are contributing talks.
We'll also include a few tips about how to use the conference platform.
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...Altinity Ltd
OSA Con 2022: Using ClickHouse Database to Power Analytics and Customer Engagement Platform
Prafulla Gupta - Times Internet
This talk covers how we empowered Product Managers and Editors at Times Internet by developing an in-house product, GrowthRx, using Clickhouse Open Source Database to track and analyze user behavior to increase user retention and customer engagement. Times Internet is India's largest digital news publisher, which manages leading brands like Times of India, Economic Times, Navbharat Times, etc, where we are tracking more than 10 billion events per month in the ClickHouse Database.
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...Altinity Ltd
- ClickHouse can query 170 billion rows at 500 queries per second with a 99th percentile latency of 110ms through careful data modeling, query optimization, and use of materialized views.
- To achieve low latency at high query rates, it is important to reduce the amount of data scanned by queries through techniques like sorting keys, data compression, and reducing data cardinality.
- Materialized views can reduce data sizes by 1000-10,000x and are critical for maintaining low query latencies on large datasets. Dividing data into read and write replicas also improves query performance.
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...Altinity Ltd
OSA Con 2022: The Open Source Analytic Universe, Version 2022
Robert Hodges - Altinity
Every generation builds new cathedrals. For many of us, this means implementing analytic applications built on a foundation of open source.
We'll survey developments in analytics since the last OSA Con and highlight new technologies that developers should be watching as we head into the mid-2020s.
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...Altinity Ltd
The document discusses how OpsVerse migrated their Jaeger distributed tracing storage from Cassandra to ClickHouse for improved performance monitoring. Jaeger is an open source distributed tracing system that was originally designed to use Elasticsearch or Cassandra for storage. While Cassandra worked well for basic functionality, it lacked capabilities for advanced analytics. ClickHouse supports richer query functions and better handles large datasets. The document outlines the steps OpsVerse took to implement the ClickHouse storage plugin for Jaeger and deploy ClickHouse on Kubernetes using the ClickHouse Operator. This migration enabled more insightful performance monitoring and analytics.
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...Altinity Ltd
OSA Con 2022: Streaming Data Made Easy
Tim Spann & David Kjerrumgaard - StreamNative
Click into new streaming applications the easy way with Apache Pulsar, Clickhouse, and Open Source. A quick introduction to how to build modern data streaming applications.
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfAltinity Ltd
OSA Con 2022 - State of Open Source Databases
Peter Zaitsev - Percona
It has been an exciting year in the open-source database industry, with more choices, more cloud, and key changes in the industry. We will dive into the key developments over 2022, including the most important open-source database software releases in general, the significance of cloud-native solutions in a multi-vendor multi-cloud world, the new criticality of security challenges, and the evolution of the open-source software industry.
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...Altinity Ltd
OSA Con 2022: Specifics of data analysis in Time Series Databases
Roman Khavronenko - VictoriaMetrics
Time series data is special. Not only its nature but also the ways that we store and interact with it.
In this talk, we'll cover the differences between storing time series data in classic relational databases
and a new generation of time series databases like VictoriaMetrics and Prometheus.
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...Altinity Ltd
OSA Con 2022: Signal Correlation, the Ho11y Grail
Michael Hausenblas - AWS.pdf
Michael shows how the signal correlation in observability use cases helps you to spot issues faster, optimize code, or make you more productive in delivering features.
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdfAltinity Ltd
OSA Con 2022: Scaling your Pandas Analytics with Modin
Doris Lee - Ponder
Pandas is one of the most commonly used data science libraries in Python, with a convenient set of APIs for data cleaning, visualization, analysis, and exploration. However, despite its widespread adoption, Pandas suffers from severe scalability issues on large datasets. We developed the open-source project Modin, which is a fast, scalable drop-in replacement for pandas. Modin has been downloaded more than 4 million times and is used by leading data science teams, including Fortune 100 companies.
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA MATKA RESULT KALYAN MATKA TIPS SATTA MATKA MATKA COM MATKA PANA JODI TODAY
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA MATKA RESULT KALYAN MATKA TIPS SATTA MATKA MATKA COM MATKA PANA JODI TODAY
LLM powered contract compliance application which uses Advanced RAG method Self-RAG and Knowledge Graph together for the first time.
It provides highest accuracy for contract compliance recorded so far for Oil and Gas Industry.