Graham Mainwaring and Robert Hodges summarize management of ClickHouse on Kubernetes using the ClickHouse Kubernetes Operator and introduce a new UI for it. Presented at the 15 Dec '22 SF Bay Area ClickHouse Meetup.
All about Zookeeper and ClickHouse Keeper.pdf (Altinity Ltd)
ClickHouse clusters depend on ZooKeeper to handle replication and distributed DDL commands. In this Altinity webinar, we’ll explain why ZooKeeper is necessary, how it works, and introduce the new built-in replacement named ClickHouse Keeper. You’ll learn practical tips to care for ZooKeeper in sickness and health. You’ll also learn how/when to use ClickHouse Keeper. We will share our recommendations for keeping that happy as well.
Better than you think: Handling JSON data in ClickHouse (Altinity Ltd)
Robert Hodges shows how ClickHouse, a relational database with tables, can offer high-performance analysis of JSON data. This talk provides a cookbook of schema design, indexing, data loading, and query tricks we have learned over years of helping users build analytical apps for service logs, observability data, financial transactions, and other types of semi-structured data. Robert Hodges is CEO of Altinity and a certified database geek.
https://altinity.com
https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup
Altinity Cluster Manager: ClickHouse Management for Kubernetes and Cloud (Altinity Ltd)
Webinar. August 21, 2019
By Robert Hodges and Altinity Engineering Team
Simplified management is a prerequisite for running any data warehouse at scale. Altinity is developing a new web-based console for ClickHouse called the Altinity Cluster Manager. It's now in beta and offers simplified operation of ClickHouse installations for users. In this webinar we introduce the ACM and demonstrate its use on Kubernetes as well as Amazon Web Services. Attendees are welcome to sign up as beta testers and provide feedback. Please join us to see the future of ClickHouse management!
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO (Altinity Ltd)
Slides from Webinar. April 16, 2019
Data services are the latest wave of applications to catch the Kubernetes bug. Altinity is pleased to introduce the ClickHouse operator, which makes it easy to run scalable data warehouses on your favorite Kubernetes distro. This webinar shows how to install the operator and bring up a new data warehouse in three simple steps. We also cover storage management, monitoring, making config changes, and other topics that will help you operate your data warehouse successfully on Kubernetes. There is time for demos and Q&A, so bring your questions. See you online!
Speaker Bio:
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
Alexander Sapin from Yandex presents reasoning, design considerations, and implementation of ClickHouse Keeper. It replaces ZooKeeper in ClickHouse clusters, thereby simplifying operation enormously.
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge... (Altinity Ltd)
The document provides an overview of ClickHouse and techniques for optimizing performance. It discusses how the ClickHouse query log can help understand query execution and bottlenecks. Methods covered for improving performance include adding indexes, optimizing data layout through partitioning and ordering, using encodings to reduce data size, and materialized views. Storage optimizations like multi-disk volumes and tiered storage are also introduced.
ClickHouse Deep Dive, by Aleksei Milovidov (Altinity Ltd)
This document provides an overview of ClickHouse, an open source column-oriented database management system. It discusses ClickHouse's ability to handle high volumes of event data in real-time, its use of the MergeTree storage engine to sort and merge data efficiently, and how it scales through sharding and distributed tables. The document also covers replication using the ReplicatedMergeTree engine to provide high availability and fault tolerance.
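The sort-and-merge idea attributed to MergeTree above can be illustrated with a minimal sketch. This is not ClickHouse's implementation: it assumes each "part" is a small list of rows already sorted by a key (the first tuple element), and `merge_parts` is a hypothetical helper name used only for illustration.

```python
import heapq

def merge_parts(*parts):
    """Merge several key-sorted parts into one key-sorted part."""
    # heapq.merge streams the already-sorted inputs, so no full re-sort is needed.
    return list(heapq.merge(*parts, key=lambda row: row[0]))

# Two illustrative parts, each sorted by its first element.
part_a = [(1, "a"), (4, "d"), (7, "g")]
part_b = [(2, "b"), (3, "c"), (9, "i")]

merged = merge_parts(part_a, part_b)
print(merged)
```

Because every input is already sorted, the merge is a single streaming pass, which is why background merges of sorted parts can stay cheap relative to re-sorting all the data.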
A Day in the Life of a ClickHouse Query Webinar
Why do queries run out of memory? How can I make my queries even faster? How should I size ClickHouse nodes for best cost-efficiency? The key to these questions and many others is knowing what happens inside ClickHouse when a query runs. This webinar is a gentle introduction to ClickHouse internals, focusing on topics that will help your applications run faster and more efficiently. We’ll discuss the basic flow of query execution, dig into how ClickHouse handles aggregation and joins, and show you how ClickHouse distributes processing within a single CPU as well as across many nodes in the network. After attending this webinar you’ll understand how to open up the black box and see what the parts are doing.
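The distributed aggregation described above can be sketched as two stages: each worker aggregates its own chunk, and the partial states are then merged. This is a simplified illustration under stated assumptions, not ClickHouse internals; the function names and the `(key, value)` row shape are hypothetical.

```python
from collections import Counter

def partial_aggregate(rows):
    """Stage 1: aggregate one chunk of (key, value) rows independently."""
    state = Counter()
    for key, value in rows:
        state[key] += value
    return state

def merge_states(states):
    """Stage 2: combine the partial states into the final result."""
    total = Counter()
    for state in states:
        total.update(state)  # Counter.update adds counts key by key
    return dict(total)

# Each inner list stands in for the data seen by one thread or node.
chunks = [
    [("us", 3), ("de", 1)],
    [("us", 2), ("fr", 5)],
]
result = merge_states(partial_aggregate(c) for c in chunks)
print(result)
```

The partial states are what consume memory during a GROUP BY, so the number of distinct keys per chunk is a reasonable first thing to check when an aggregation runs out of memory.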
Data warehouse on Kubernetes - gentle intro to Clickhouse Operator, by Robert... (Altinity Ltd)
San Diego Cloud Native Computing Meetup, January 23, 2020
Presented by Robert Hodges, Altinity CEO
Data services are the latest wave of applications to catch the Kubernetes bug, but how many people would guess that includes data warehouses? We proved it works by developing the ClickHouse Kubernetes operator, which is now in production use at companies like Mux.com. It's an open source operator to stand up and run ClickHouse, a popular Apache 2.0 data warehouse that can return queries on trillions of rows in seconds or less. This talk introduces ClickHouse and shows why it's a 'cloud friendly' DBMS. We'll go mano-a-mano with the ClickHouse operator, showing how you can spin up data warehouses in 60 seconds or less. We'll cover issues like storage management, monitoring and upgrade. In short, everything you need to know to try running your own ClickHouse data warehouses on Kubernetes.
This document provides an overview and introduction to ClickHouse, an open source column-oriented data warehouse. It discusses installing and running ClickHouse on Linux and Docker, designing tables, loading and querying data, available client libraries, performance tuning techniques like materialized views and compression, and strengths/weaknesses for different use cases. More information resources are also listed.
Ramazan Polat gives 10 good reasons to use ClickHouse, including that it has blazing fast inserts and selects that can handle billions of rows in under a second. It scales linearly across machines and compresses data effectively. ClickHouse is also production ready with features like fault tolerance, replication, and integration capabilities. It has powerful features like arrays, nested columns, and materialized views. ClickHouse also has a great SQL implementation and ecosystem.
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges (Altinity Ltd)
Slides for the Webinar, presented on March 6, 2019
For the webinar video visit https://www.altinity.com/
Extracting business insight from massive pools of machine-generated data is the central analytic problem of the digital era. ClickHouse data warehouse addresses it with sub-second SQL query response on petabyte-scale data sets. In this talk we'll discuss the features that make ClickHouse increasingly popular, show you how to install it, and teach you enough about how ClickHouse works so you can try it out on real problems of your own. We'll have cool demos (of course) and gladly answer your questions at the end.
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare (Altinity Ltd)
The document summarizes how ClickHouse stores and retrieves data from MergeTree tables. It discusses how data is stored in parts organized by primary key, with each column's data and marks stored in separate files. It describes how the primary index and mark cache are used to efficiently find and read data, and how mark cache performance impacts SELECT queries. It provides examples of calculating mark sizes and dropping the mark cache.
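The mark-size arithmetic mentioned above can be approximated with a short calculation. This is a rough sketch, not the method from the slides: it assumes one mark per granule of 8192 rows (the usual default for `index_granularity`) and a fixed `bytes_per_mark` value, which is an illustrative assumption that varies with the part format.

```python
import math

def mark_count(rows, index_granularity=8192):
    """Number of marks per column: one mark per granule of rows."""
    return math.ceil(rows / index_granularity)

def marks_bytes(rows, columns, bytes_per_mark=24, index_granularity=8192):
    """Rough size of all mark data for one part.

    bytes_per_mark is an assumption for illustration only.
    """
    return mark_count(rows, index_granularity) * columns * bytes_per_mark

marks = mark_count(1_000_000)
size = marks_bytes(1_000_000, columns=10)
print(marks, "marks per column,", size, "bytes of marks for 10 columns")
```

Multiplying this estimate across all parts and columns touched by typical queries gives a first-order idea of how large the mark cache needs to be.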
Size Matters-Best Practices for Trillion Row Datasets on ClickHouse-2202-08-1... (Altinity Ltd)
ClickHouse is so fast that virtually any developer can get a sub-second response on tables running into billions of rows. It’s different once you reach data sizes in the hundreds of billions or trillions of rows. This webinar walks you through best practices for designing a schema, loading data, and running queries on very large datasets. Expert tricks like combining events in a single fact table, using aggregation to simulate joins, and using materialized views to “index” interesting events in large fact tables are all covered. We’ll even demonstrate the ideas on a trillion-row test data set. Want to scale your data? This webinar is the place to start.
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO (Altinity Ltd)
From webinar on December 3, 2019
New users of ClickHouse love the speed but may run into a few surprises when designing applications. Column storage turns classic SQL design precepts on their heads. This talk shares our favorite tricks for building great applications. We'll talk about fact tables and dimensions, materialized views, codecs, arrays, and skip indexes, to name a few of our favorites. We'll show examples of each and also reserve time to handle questions. Join us to take your next step to ClickHouse guruhood!
Fast Insight from Fast Data: Integrating ClickHouse and Apache Kafka (Altinity Ltd)
This document discusses using Kafka as a messaging system with ClickHouse for high throughput and low latency data ingestion. It provides an overview of Kafka and how it can be used with ClickHouse, including creating Kafka topics, Kafka and ClickHouse tables, materialized views to transfer data, and best practices. It also covers alternatives to the ClickHouse Kafka engine and the roadmap for further improving the Kafka integration.
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021 (Altinity Ltd)
ClickHouse is open source. You can build it yourself. What’s more, you can make it better! In this webinar, we’ll demonstrate how to pull the ClickHouse code from GitHub and build it. We’ll then walk through how to contribute a new feature to ClickHouse by developing, testing, and pushing a pull request through the community merge process. There will be demos and ample time for questions. Join us to get started as a ClickHouse developer!
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges (Altinity Ltd)
From webinars September 11 and September 17, 2019
ClickHouse is famous for speed. That said, you can almost always make it faster! This webinar uses examples to teach you how to deduce what queries are actually doing by reading the system log and system tables. We'll then explore standard ways to increase query speed: data types and encodings, filtering, join reordering, skip indexes, materialized views, session parameters, to name just a few. In each case we'll circle back to query plans and system metrics to demonstrate changes in ClickHouse behavior that explain the boost in performance. We hope you'll enjoy the first step to becoming a ClickHouse performance guru!
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf (Altinity Ltd)
Robert Hodges from Altinity, an enterprise provider of ClickHouse and developer of the ClickHouse Kubernetes operator, provides an introduction to running ClickHouse on Kubernetes. The presentation demonstrates how to deploy ClickHouse and Zookeeper on Kubernetes using the ClickHouse Kubernetes operator. It shows how to define ClickHouse installations using custom resources, access ClickHouse, and update the cluster configuration, such as changing the number of shards and replicas or ClickHouse version. The operator automatically applies configuration changes to pods.
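As an illustration of defining an installation through a custom resource, the sketch below builds a ClickHouseInstallation manifest as a plain dictionary. It is a minimal sketch under stated assumptions: the field names follow the operator's published examples as commonly documented and should be verified against the operator documentation, the cluster name and helper function are hypothetical, and the ZooKeeper settings that replication needs are omitted.

```python
import json

def clickhouse_installation(name, shards, replicas):
    """Build a ClickHouseInstallation manifest as a plain dict (illustrative)."""
    return {
        "apiVersion": "clickhouse.altinity.com/v1",
        "kind": "ClickHouseInstallation",
        "metadata": {"name": name},
        "spec": {
            "configuration": {
                "clusters": [
                    {
                        "name": "main",  # hypothetical cluster name
                        "layout": {
                            "shardsCount": shards,
                            "replicasCount": replicas,
                        },
                    }
                ]
            }
        },
    }

manifest = clickhouse_installation("demo", shards=2, replicas=2)
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file and applied with `kubectl apply -f`; changing the shard or replica counts and re-applying is the kind of configuration update the description refers to.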
Introduction to Jenkins X - a beginner's guide (Andrew Bayer)
The document provides an overview of Jenkins X, including:
- What Jenkins X is and how it differs from Jenkins
- How to install Jenkins X using the CLI
- Capabilities like quickstarts, build packs, pipelines, environments and ChatOps
- When Jenkins X would be suitable compared to manually configuring Kubernetes
- Next steps to learn more about Jenkins X
Deploying Windows Apps to Kubernetes with Draft and Helm (Jessica Deen)
This document provides an overview and summary of a presentation about deploying Windows applications with Kubernetes, Draft, and Helm. It begins with introductions and disclaimers about the hands-on lab. It then discusses the history of building Kubernetes on Windows and mixed clusters. The presentation demonstrates deploying applications across Windows and Linux nodes using kubectl and shows how to consider Windows-specific aspects like resource limits and node selection. It also covers Helm charts for application deployment and management and using Draft to simplify and automate application development and deployment to Kubernetes.
This document provides an introduction and agenda for a two-day training on Kubernetes. Day one will cover Kubernetes concepts like pods, services, replica sets, deployments and namespaces. It will also include hands-on exercises. Day two will focus on additional concepts like config maps, secrets, auto-scaling and Helm. It will end with further hands-on experience and conclusions.
This document outlines the steps to run Kubernetes locally, including required installations like Java 8, Maven, Git, Kubernetes CLI (kubectl), Minikube, and Docker. It discusses benefits like cloud-native development and testing applications locally before deploying to cloud providers. The steps covered include starting Minikube, building and pushing a Docker image to Minikube's registry, deploying microservices interactively with kubectl or declaratively with YAML files, exposing services, and testing before stopping Minikube.
OSS Japan 2019 service mesh bridging Kubernetes and legacy (Steve Wong)
How to join legacy VMs and bare metal machines to a Kubernetes service mesh so that VMs can consume Kubernetes services and publish services used by Kubernetes-hosted applications.
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf (Altinity Ltd)
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data - Presentation Slides
Altinity.Cloud is a fully automated cloud service for ClickHouse that is optimized for real-time analytics.
In this webinar, we’ll explain how Altinity.Cloud works, then show how to set up your first ClickHouse cluster. We’ll then tour important features like scale-up, scale-out, uptime schedules, and DBA tools to analyze your tables.
You’ll learn everything necessary to start working on real-time analytics today.
Bring your questions!
Presenters: Robert Hodges & Alexander Zaitsev
Note: This webinar will be recorded and later posted on our Webinar page (https://altinity.com/webinarspage/) or the official Altinity YouTube channel (https://www.youtube.com/@Altinity).
DevOpsCon Berlin: Helm vs Operators – Do I Need to Decide? (Nico Meisenzahl)
In this session, Nico will talk about Helm and Operators as well as the battle between them. If you ever ask yourself one of the following questions, this talk is the right one for you:
"Do I have to decide?"
"Do I need to learn both?"
"What are the best tools for my use-case?"
In short: There is no battle. Both tools have different scopes. Nico will talk about the pros and cons, and what the projects themselves focus on. Join his talk to learn more!
This document provides an agenda and instructions for learning Kubernetes in 90 minutes. The agenda includes exercises on running a first web service in Kubernetes, revisiting pods, deployments and services, deploying with YAML files, and installing a microservices application called Guestbook. Key Kubernetes concepts covered include pods, deployments, services, YAML descriptors, and using deployments to scale applications. The document also provides background on containers, Docker, and the Kubernetes architecture.
KubeOne is an open source tool for managing the lifecycle of Kubernetes clusters, including installing, upgrading, and decommissioning clusters on major cloud providers and on-premises. It uses tools like kubeadm and Kubermatic machine-controller to provision clusters in a declarative way. The presentation demonstrates installing a highly available Kubernetes cluster on AWS using KubeOne by defining a cluster manifest and running the install command.
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf (Altinity Ltd)
The document discusses Altinity.Cloud Anywhere, a service that allows users to run ClickHouse databases on their own Kubernetes clusters. It provides automation of ClickHouse operations and management through the Altinity Connector. Users can prepare their own Kubernetes environment, connect it to Altinity.Cloud, and then launch and manage ClickHouse clusters on their infrastructure. Advanced topics covered include how the service works internally and how to get support from Altinity.
The document discusses various Kubernetes concepts including pods, deployments, services, ingress, labels, health checks, config maps, secrets, volumes, autoscaling, resource quotas, namespaces, Helm, and the Kubernetes Dashboard. Kubernetes is a container orchestration tool that manages container deployment, scaling, and networking. It uses pods to group containers, deployments to manage pods, and services for exposing applications.
Kubespray and Ansible can be used to automate the installation of Kubernetes in a production-ready environment. Kubespray provides tools to configure highly available Kubernetes clusters across multiple Linux distributions. Ansible is an IT automation tool that can deploy software and configure systems. The document then provides a 6 step guide for installing Kubernetes on Ubuntu using kubeadm, including installing Docker, kubeadm, kubelet and kubectl, disabling swap, configuring system parameters, initializing the cluster with kubeadm, and joining nodes. It also briefly explains Kubernetes architecture including the master node, worker nodes, addons, CNI, CRI, CSI and key concepts like pods, deployments, networking,
Installing and Using Kubernetes is hard, but Operating Kubernetes is even harder! This BOF is for Kubernetes Operators to get together and discuss our day to day Operations, and for people new to Kubernetes to learn more about how to operate it.
This document discusses running CI/CD pipelines with VMware Cloud PKS using Jenkins X. It provides an overview of Jenkins X and how it extends Kubernetes with custom resource definitions. Jenkins X allows automation of CI/CD pipelines through a single CLI interface and includes features like GitOps promotion of applications between environments. The document also compares Jenkins X to Jenkins and notes how Jenkins X is focused specifically on Kubernetes and implementing best practices for native Kubernetes CI/CD.
After almost 3 years with Kubernetes and some "war stories", we take a top-down approach to Kubernetes, then glimpse the bottom-up view and where it can be customized.
This document introduces the CitusTM IoT Ecosystem, which allows users to develop and integrate IoT products, visualize sensor data, and build sharing economy business models on a centralized platform. It can be deployed on dedicated or shared infrastructure using Docker Compose, Kubernetes, or AWS CloudFormation. The ecosystem provides services for device management, sensor analytics, recognition applications, and more through container-based microservices that can be easily deployed and shared across users. Setup instructions are included to deploy the ecosystem locally using Docker Compose or on AWS using a CloudFormation template.
Mattia Gandolfi - Improving utilization and portability with Containers and C... (Codemotion)
Google has pioneered the usage of containers at huge scale. Learn how we designed our systems to handle insane traffic loads, orchestrating complex, globally distributed applications, and how you can leverage this infrastructure and our agile development technologies to embrace the power of DevOps and Cloud on our Google Cloud Platform.
This document provides an overview of Kubernetes and how it compares to VMware technologies. It begins with an analogy that containers are to operating systems what virtual machines are to server hardware. It then discusses how Kubernetes orchestrates multiple containers across nodes by splitting applications into smaller services. The remainder of the document discusses key Kubernetes concepts like pods, replica sets, deployments and services. It provides a mapping of how Kubernetes concepts compare to VMware concepts like vCenter and vSphere hosts. It also discusses considerations for installing Kubernetes and operating it at scale.
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx (Altinity Ltd)
Building an Analytic Extension to MySQL with ClickHouse and Open Source
In this webinar Percona and Altinity offer suggestions and tips on how to recognize when MySQL is overburdened with analytics and can benefit from ClickHouse’s unique capabilities.
Also, they will walk you through important patterns for integrating MySQL and ClickHouse which will enable the building of powerful and cost-efficient applications that leverage the strengths of both databases.
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022... (Altinity Ltd)
Over the last few years Kubernetes has transitioned from an object of curiosity and fear to a robust platform for big data. Watch this webinar and you will learn how the Altinity Kubernetes Operator for ClickHouse enables users to run high performance analytics on ClickHouse. We will show you a simple installation and teach you how to scale it into a cluster that can analyze 100s of terabytes of data. Along the way we’ll share our lessons for ClickHouse on Kubernetes in Altinity.Cloud. We built it on Kubernetes using the Altinity Operator and now run hundreds of clusters in the cloud. You can too!
Building an Analytic Extension to MySQL with ClickHouse and Open Source (Altinity Ltd)
This is a joint Percona and Altinity webinar.
In this webinar we will discuss suggestions and tips on how to recognize when MySQL is overburdened with analytics and can benefit from ClickHouse’s unique capabilities.
We will then walk through important patterns for integrating MySQL and ClickHouse which will enable the building of powerful and cost-efficient applications that leverage the strengths of both databases.
Fun with ClickHouse Window Functions-2021-08-19.pdf (Altinity Ltd)
Fun with ClickHouse Window Functions | Altinity Webinar
Window functions have arrived in ClickHouse!
Our webinar will start with an introduction to standard window function syntax and show how it is implemented in ClickHouse. We’ll next show you problems that you can now solve easily using window functions. Finally, we’ll compare window functions to arrays, another powerful ClickHouse feature.
There will be time for questions with our SQL experts.
Join us for a complete overview of this long-awaited feature!
Speakers:
Robert Hodges, CEO @Altinity
Vitaliy Zakaznikov, QA Manager and Architect @Altinity
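To make the window function idea from this webinar description concrete, the sketch below emulates a running total in plain Python. It is only an illustration of the semantics of `sum(value) OVER (PARTITION BY key ORDER BY ts)`, assuming distinct `ts` values within each partition; the row shape and function name are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def running_sum(rows):
    """Emulate sum(value) OVER (PARTITION BY key ORDER BY ts) on (key, ts, value) rows."""
    out = []
    # Sort by partition key, then by the ordering column.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for key, group in groupby(ordered, key=itemgetter(0)):
        total = 0  # the running total restarts for every partition
        for _, ts, value in group:
            total += value
            out.append((key, ts, value, total))
    return out

rows = [("a", 2, 10), ("b", 1, 5), ("a", 1, 7), ("b", 2, 1)]
result = running_sum(rows)
for row in result:
    print(row)
```

Unlike a GROUP BY, the output keeps one row per input row, which is the defining property of a window function.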
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A... (Altinity Ltd)
Altinity Stable Builds offer a ClickHouse distribution that is ready for production use and with 3 years of maintenance. Our webinar introduces the special features of Stable Builds and describes how we build them from ClickHouse Long-Term Support (LTS) releases. We’ll show you how to find them and install them yourself, then guide you through the important topic of upgrading. We’ll also walk through how to use Altinity Stable Builds in Altinity.Cloud, our managed ClickHouse platform for high-performance analytics.
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo... (Altinity Ltd)
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHouse Webinar Slides
Monitoring is the key to the successful operation of any software service, but commercial solutions are complex, expensive, and slow. Let us show you how to build monitoring that is simple, cost-effective, and fast using open-source stacks easily accessible to any developer.
We’ll start with the elements of monitoring systems: data ingest, query engine, visualization, and alerting. We’ll then explain and contrast two implementation approaches. The first uses VictoriaMetrics, a fast-growing, high-performance time series database that uses PromQL for queries. The second is based on ClickHouse, a popular real-time analytics database that speaks SQL. Fast, affordable monitoring is within reach. This webinar provides designs and working code to get you there.
Presented by:
Roman Khavronenko, Co-Founder at VictoriaMetrics
Robert Hodges, CEO at Altinity
ClickHouse ReplacingMergeTree in Telecom Apps (Altinity Ltd)
Alexandr Dubovikov of QXIP explains how to use ClickHouse ReplacingMergeTree engine for an important Telecom use case: tracking state of calls from incoming call detail records aka CDRs. (https://www.meetup.com/san-francisco-bay-area-clickhouse-meetup/events/289605843/)
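The call-state use case above relies on ReplacingMergeTree keeping only the newest row per sorting key once parts are merged. The sketch below imitates that outcome in plain Python; it is an illustration rather than the talk's code, and the `(call_id, version, state)` row shape and the function name are hypothetical.

```python
def replace_by_version(rows):
    """Keep, for each call_id, only the row with the highest version."""
    latest = {}
    for call_id, version, state in rows:
        current = latest.get(call_id)
        # On equal versions the later row wins, mirroring "last inserted" behavior.
        if current is None or version >= current[1]:
            latest[call_id] = (call_id, version, state)
    return sorted(latest.values())

# Call detail records arriving over time for two calls.
cdrs = [
    ("call-1", 1, "ringing"),
    ("call-2", 1, "ringing"),
    ("call-1", 2, "answered"),
    ("call-1", 3, "ended"),
]
current_state = replace_by_version(cdrs)
print(current_state)
```

In ClickHouse the replacement happens during background merges, so queries that need the current state immediately typically have to deduplicate at read time as well.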
Adventures with the ClickHouse ReplacingMergeTree Engine (Altinity Ltd)
Presentation on ReplacingMergeTree by Robert Hodges of Altinity at the 14 December 2022 SF Bay Area ClickHouse Meetup (https://www.meetup.com/san-francisco-bay-area-clickhouse-meetup/events/289605843/)
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot (Altinity Ltd)
This document provides an overview of building a real-time analytics application with Apache Pulsar and Apache Pinot. It introduces Mary Grygleski and Mark Needham, describes what real-time analytics is, and discusses the properties of real-time analytics systems. It then demonstrates how to ingest data from the Wikimedia recent changes feed into Pulsar and Pinot for real-time analytics and builds a dashboard with the data using Streamlit.
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe... (Altinity Ltd)
OSA Con 2022: What Data Engineering Can Learn from Frontend Engineering
Pete Hunt - Elementl
Frontend engineering went through a revolution in the last decade. I'll recap what happened, and how a similar revolution started in data engineering.
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf (Altinity Ltd)
OSA Con 2022: Welcome to OSA CON Version 2022
Robert Hodges - Altinity
Join us as we guide you through the conference and highlight the many presenters who are contributing talks.
We'll also include a few tips about how to use the conference platform.
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga... (Altinity Ltd)
OSA Con 2022: Using ClickHouse Database to Power Analytics and Customer Engagement Platform
Prafulla Gupta - Times Internet
This talk covers how we empowered Product Managers and Editors at Times Internet by developing an in-house product, GrowthRx, using the ClickHouse open source database to track and analyze user behavior to increase user retention and customer engagement. Times Internet is India's largest digital news publisher and manages leading brands like Times of India, Economic Times, and Navbharat Times, where we track more than 10 billion events per month in ClickHouse.
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou... (Altinity Ltd)
- ClickHouse can query 170 billion rows at 500 queries per second with a 99th percentile latency of 110ms through careful data modeling, query optimization, and use of materialized views.
- To achieve low latency at high query rates, it is important to reduce the amount of data scanned by queries through techniques like sorting keys, data compression, and reducing data cardinality.
- Materialized views can reduce data sizes by 1000-10,000x and are critical for maintaining low query latencies on large datasets. Dividing data into read and write replicas also improves query performance.
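The data reduction that the points above attribute to materialized views comes from pre-aggregation: many raw rows collapse into one row per time bucket and key. The following sketch shows the effect on synthetic data; the hourly bucket, the event shape, and the function name are assumptions chosen for illustration, not figures from the talk.

```python
from collections import defaultdict

def rollup(events, bucket_seconds=3600):
    """Pre-aggregate (timestamp, key) events into per-bucket counts."""
    counts = defaultdict(int)
    for ts, key in events:
        bucket = ts // bucket_seconds * bucket_seconds  # start of the bucket
        counts[(bucket, key)] += 1
    return dict(counts)

# Synthetic data: 10,000 events spread over two hours and two keys.
events = [(i % 7200, "page_a" if i % 2 else "page_b") for i in range(10_000)]
summary = rollup(events)
print(len(events), "raw rows ->", len(summary), "aggregated rows")
```

The reduction factor depends on how many raw events share a bucket and key, so coarser buckets and lower-cardinality keys give larger savings.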
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge... (Altinity Ltd)
OSA Con 2022: The Open Source Analytic Universe, Version 2022
Robert Hodges - Altinity
Every generation builds new cathedrals. For many of us, this means implementing analytic applications built on a foundation of open source.
We'll survey developments in analytics since the last OSA Con and highlight new technologies that developers should be watching as we head into the mid-2020s.
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A... (Altinity Ltd)
The document discusses how OpsVerse migrated their Jaeger distributed tracing storage from Cassandra to ClickHouse for improved performance monitoring. Jaeger is an open source distributed tracing system that was originally designed to use Elasticsearch or Cassandra for storage. While Cassandra worked well for basic functionality, it lacked capabilities for advanced analytics. ClickHouse supports richer query functions and better handles large datasets. The document outlines the steps OpsVerse took to implement the ClickHouse storage plugin for Jaeger and deploy ClickHouse on Kubernetes using the ClickHouse Operator. This migration enabled more insightful performance monitoring and analytics.
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St... (Altinity Ltd)
OSA Con 2022: Streaming Data Made Easy
Tim Spann & David Kjerrumgaard - StreamNative
Click into new streaming applications the easy way with Apache Pulsar, ClickHouse, and Open Source. A quick introduction to how to build modern data streaming applications.
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf (Altinity Ltd)
OSA Con 2022 - State of Open Source Databases
Peter Zaitsev - Percona
It has been an exciting year in the open-source database industry, with more choices, more cloud, and key changes in the industry. We will dive into the key developments over 2022, including the most important open-source database software releases in general, the significance of cloud-native solutions in a multi-vendor multi-cloud world, the new criticality of security challenges, and the evolution of the open-source software industry.
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh... (Altinity Ltd)
OSA Con 2022: Specifics of data analysis in Time Series Databases
Roman Khavronenko - VictoriaMetrics
Time series data is special. Not only its nature but also the ways that we store and interact with it.
In this talk, we'll cover the differences between storing time series data in classic relational databases and a new generation of time series databases like VictoriaMetrics and Prometheus.
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS... (Altinity Ltd)
OSA Con 2022: Signal Correlation, the Ho11y Grail
Michael Hausenblas - AWS
Michael shows how signal correlation in observability use cases helps you spot issues faster, optimize code, and deliver features more productively.
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf (Altinity Ltd)
OSA Con 2022: Scaling your Pandas Analytics with Modin
Doris Lee - Ponder
Pandas is one of the most commonly used data science libraries in Python, with a convenient set of APIs for data cleaning, visualization, analysis, and exploration. However, despite its widespread adoption, Pandas suffers from severe scalability issues on large datasets. We developed the open-source project Modin, which is a fast, scalable drop-in replacement for pandas. Modin has been downloaded more than 4 million times and is used by leading data science teams, including Fortune 100 companies.
AWS re:Invent 2023 - Deep dive into Amazon Aurora and its innovations DAT408 (Grant McAlister)
With an innovative architecture that decouples compute from storage and advanced features like Global Database and low-latency read replicas, Amazon Aurora reimagines what it means to be a relational database. Aurora is a modern database service offering unparalleled performance and high availability at scale with full open source MySQL and PostgreSQL compatibility. In this session, dive deep into the most exciting new features Aurora offers, including Aurora I/O-Optimized, Aurora zero-ETL integration with Amazon Redshift, and Aurora Serverless v2. Learn how the addition of the pgvector extension allows for the storage of vector embeddings and support of vector similarity searches for generative AI.
The Rise of Python in Finance, Automating Trading Strategies: _.pdf (Riya Sen)
In the dynamic realm of finance, where every second counts, the integration of technology has become indispensable. Aspiring traders and seasoned investors alike are turning to coding as a powerful tool to unlock new avenues of financial success. In this blog, we delve into the world of Python live trading strategies, exploring how coding can be the key to navigating the complexities of the market and securing your path to prosperity.