GPU data warehouse SQream DB provides a massively parallel processing engine powered by GPUs that is faster and more efficient than CPU-based systems. It can ingest terabytes of data per hour with a single GPU and handle petabytes of data from a compact 2U server. With familiar SQL queries and standard connectors, SQream DB accelerates analytics by up to 100x over traditional warehouses through GPU-accelerated processing and columnar storage.
There’s a popular misconception about I/O that (modern) SSDs are easy to deal with: they work pretty much like RAM but behind a “legacy” submit-complete API, and other than keeping a disk’s possible peak performance in mind, and maybe maintaining priorities between different I/O streams, there’s not much to care about. This is not quite the case. SSDs do show non-linear behavior, and understanding a disk’s real abilities is crucial when it comes to squeezing as much performance from it as possible.
Diskplorer is an open-source disk latency/bandwidth exploring toolset. By using Linux fio under the hood it runs a battery of measurements to discover performance characteristics for a specific hardware configuration, giving you an at-a-glance view of how server storage I/O will behave under load.
ScyllaDB CTO Avi Kivity will share an interesting approach to measuring disk behavior under load, give a walkthrough of Diskplorer and explain how it’s used.
With this elaborated model of the disk at hand, it becomes possible to build latency-oriented I/O scheduling that cherry-picks requests from the incoming queue, keeping the disk load perfectly balanced.
ScyllaDB engineer Pavel Emelyanov will also present the scheduling algorithm developed for the Seastar framework and share results achieved using it.
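The cherry-picking idea above can be sketched as follows. This is a toy Python illustration, not Seastar's actual C++ scheduler; the `optimal_depth` threshold stands in for the balance point that a Diskplorer-style latency/bandwidth sweep would find, and all names are assumptions for this sketch.

```python
import heapq

class LatencyAwareScheduler:
    """Toy latency-oriented I/O scheduler: admit requests only while the
    number in flight stays below the depth at which measured latency
    starts to climb non-linearly."""

    def __init__(self, optimal_depth):
        self.optimal_depth = optimal_depth   # depth found by benchmarking
        self.in_flight = 0
        self.pending = []                    # heap of (priority, seq, request)
        self.seq = 0                         # tie-breaker keeps FIFO order

    def submit(self, request, priority=0):
        # Queue the request, then dispatch as many as the disk can absorb.
        heapq.heappush(self.pending, (priority, self.seq, request))
        self.seq += 1
        return self.dispatch()

    def complete(self, n=1):
        # Called when n requests finish; frees capacity for queued work.
        self.in_flight -= n
        return self.dispatch()

    def dispatch(self):
        started = []
        while self.pending and self.in_flight < self.optimal_depth:
            _, _, req = heapq.heappop(self.pending)
            self.in_flight += 1
            started.append(req)
        return started
```

Requests beyond the measured depth simply wait in the priority queue, so the disk never operates past the point where its latency curve bends.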
RedisConf17 - IoT Backend with Redis and Node.js (Redis Labs)
This document describes an IoT backend solution using Redis and S3 to address the problems of supporting large numbers of IoT devices in a cost effective and scalable way. The key aspects of the solution are:
1) Storing daily energy data from devices in Redis lists with TTLs and historical data in S3 files to allow querying recent and historical data separately.
2) Modeling device and account data as Redis hashes and indexes as sets/zsets to enable flexible querying.
3) Using Redis to track API usage and implement throttling to prevent misuse, storing counts by API key and time bucket with expiration.
4) The solution supports 100k devices on a single low-
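The throttling pattern from point 3 can be sketched in pure Python. Here an in-memory dict stands in for Redis `INCR`/`EXPIRE` counters keyed by API key and time bucket; the class and parameter names are illustrative, not taken from the talk.

```python
import time

class BucketThrottle:
    """Time-bucketed API rate limiter: one counter per (api_key, bucket),
    with expired buckets dropped (Redis would do this via EXPIRE)."""

    def __init__(self, limit, bucket_seconds=60):
        self.limit = limit
        self.bucket_seconds = bucket_seconds
        self.counters = {}          # (api_key, bucket) -> request count

    def allow(self, api_key, now=None):
        now = time.time() if now is None else now
        bucket = int(now // self.bucket_seconds)
        # Discard counters from past buckets, mimicking key expiration.
        self.counters = {k: v for k, v in self.counters.items()
                         if k[1] == bucket}
        key = (api_key, bucket)
        count = self.counters.get(key, 0) + 1
        self.counters[key] = count
        return count <= self.limit   # over the limit -> throttled
```

With Redis, the same logic is typically a single `INCR` on a key like `usage:{api_key}:{bucket}` plus an `EXPIRE` set on first increment.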
Renegotiating the boundary between database latency and consistency (ScyllaDB)
With the increasing complexity of modern distributed systems, concerns around latency, availability, and consistency have become almost 'universal'. In response, a new generation of distributed databases is taking over: databases capable of harnessing the power and capabilities of the multi-cloud ecosystem. This new generation of distributed databases is challenging many of the traditional tradeoffs between relational and non-relational models.
This webinar will explore the technologies and trends behind this new generation of distributed databases, then take a technical deep dive into one example: the open source non-relational database ScyllaDB. ScyllaDB was built specifically for extreme low latencies, but has recently increased consistency by implementing the Raft consensus protocol. Engineers will share how they are implementing a low-latency architecture, and how strongly consistent topology and schema changes enable highly reliable and safe systems, without sacrificing low-latency characteristics.
Netflix stores 98 percent of the data related to its streaming services: everything from bookmarks and viewing history to billing and payment information. These services and applications simply require a highly available and scalable persistence solution to keep running efficiently in both normal and disaster situations. How does Netflix plan capacity for its new as well as existing services?
In this talk, Arun Agrawal, Senior Software Engineer, and Ajay Upadhyay, Cloud Data Architect at Netflix, will talk about capacity planning and capacity forecasting in the Cassandra world.
We will take you through the science behind forecasting short- and long-term usage and auto-scaling adequate capacity well before C* clusters reach their limits. This guarantees a highly scalable and available persistence solution that meets our SLAs at Netflix.
About the Speakers
Ajay Upadhyay, Senior Database Engineer, Netflix
Responsible for the persistence layer at Netflix as part of the CDE (Cloud Database Engineering) team. Works with application teams, suggesting and guiding best practices for the various persistence layers provided by CDE.
Arun Agrawal Senior Software Engineer, Netflix
Arun Agrawal is part of Cloud Database Engineering, which provides CaaS (Cassandra as a Service). He ensures smooth operation of the service and finds innovative ways to reduce the management overhead of running CaaS.
Running Analytics at the Speed of Your Business (Redis Labs)
The speed at which you can extract insights from your data is increasingly a competitive edge for your business. Data and analytics have to move at lightning-fast speeds to seriously impact your user acquisition.
Join this webinar featuring Forrester analyst Noel Yuhanna and Leena Joshi, VP Product Marketing at Redis Labs to learn how you can glean insights faster with new open source data processing frameworks like Spark and Redis.
In this webinar you will learn:
* Why analytics has to run at the real time speed of business
* How this can be achieved with next generation Big Data tools
* How data structures can optimize your hybrid transaction-analytics processing scenarios
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger (inside-BigData.com)
In this presentation from the GPU Technology Conference, Wyatt Gorman from Google and Abhishek Gupta from Schlumberger present: Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger.
"Demand for GPUs in High Performance Computing is only growing, and it is costly and difficult to keep pace in an entirely on-premise environment. We will hear from Schlumberger on why and how they are utilizing cloud-based GPU-enabled computing resources from Google Cloud to supply their users with the computing power they need, from exploration and modeling to visualization."
Watch the video: https://wp.me/p3RLHQ-kcl
Learn more: https://www.blog.google/products/google-cloud/schlumberger-chooses-gcp-to-deliver-new-oil-and-gas-technology-platform/
and
https://www.nvidia.com/en-us/gtc/
Yahoo - Moving beyond running 100% of Apache Pig jobs on Apache Tez (DataWorks Summit)
Last year at Yahoo, we spent great effort scaling, stabilizing, and making Pig on Tez production-ready, and by the end of the year we retired running Pig jobs on MapReduce. This talk will detail the performance and resource utilization improvements Yahoo achieved after migrating all Pig jobs to run on Tez.
After the successful migration and the improved performance, we shifted our focus to addressing some of the bottlenecks we identified and to new optimization ideas we came up with to make it go even faster. We will go over the new features and work done in Tez to make that happen, such as a custom YARN ShuffleHandler, reworked DAG scheduling order, and serialization changes.
We will also cover exciting new features added to Pig for performance, such as bloom join and bytecode generation. A distributed bloom join that can create multiple bloom filters in parallel was straightforward to implement with the flexibility of Tez DAGs; it vastly improved performance and reduced disk and network utilization for our large joins. Bytecode generation for projection and filtering of records is another big feature, targeted for Pig 0.17, which will speed up processing by reducing virtual function calls.
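The bloom join idea above (build a compact filter from the small side of the join, then discard non-matching rows of the large side before the actual join) can be sketched in Python. The names here are illustrative, not Pig's or Tez's actual implementation.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: membership tests may yield false positives
    but never false negatives, which is exactly what a join filter needs."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive several bit positions from one SHA-256 digest.
        digest = hashlib.sha256(str(item).encode()).digest()
        for i in range(self.num_hashes):
            chunk = int.from_bytes(digest[4 * i:4 * i + 4], "big")
            yield chunk % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def bloom_join(small, large):
    """Join two lists of (key, value) rows, filtering the large side first
    so most non-matching rows never reach the hash lookup."""
    bf = BloomFilter()
    lookup = {}
    for k, v in small:
        bf.add(k)
        lookup.setdefault(k, []).append(v)
    return [(k, sv, lv) for k, lv in large if bf.might_contain(k)
            for sv in lookup.get(k, [])]
```

In the distributed setting the talk describes, each parallel task builds its own filter over its partition of the small side, and the filters are broadcast to the tasks scanning the large side.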
GTC Taiwan 2017 - Performance Optimization Using GPUs on Google Cloud (NVIDIA Taiwan)
The document discusses using GPUs on Google Cloud Platform for accelerating compute-intensive workloads. It describes how GPUs can provide significant performance gains for machine learning, high performance computing, and visualization workloads. It provides examples of customers like Schlumberger leveraging GPUs on GCP for oil exploration and Shazam for music fingerprinting. The document also highlights the flexibility, scalability, and cost benefits of using GPUs on Google Cloud Platform.
RedisConf17 - Home Depot - Turbo charging existing applications with Redis (Redis Labs)
The Home Depot is transforming its architecture to use microservices and polyglot persistence to handle increasing online order volumes of 250,000 lines per hour. Redis is being used to turbo charge existing monolithic applications by offloading pieces to new processes using patterns like caching, concurrency management, and powering algorithms. This improves performance by reducing database degradation and wait times by over 95%. Next steps include setting up Redis clusters on-premises and off-premises to further reduce database CPU usage and onboard more patterns.
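The caching pattern mentioned above can be sketched as a cache-aside helper. This is a generic Python illustration, with a dict standing in for Redis `GET`/`SETEX`, not Home Depot's actual code; all names are assumptions.

```python
import time

def make_cached_fetch(db_read, ttl_seconds=300):
    """Cache-aside: check the cache first and only fall through to the
    (slow) database on a miss, storing the result with a TTL."""
    cache = {}

    def fetch(key, now=None):
        now = time.time() if now is None else now
        hit = cache.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]                    # cache hit: no DB round trip
        value = db_read(key)                 # cache miss: hit the database
        cache[key] = (value, now + ttl_seconds)
        return value

    return fetch
```

Offloading reads this way is what reduces database degradation: repeated reads of the same key within the TTL never touch the database at all.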
RedisConf17 - Turbo-charge your apps with Amazon ElastiCache for Redis (Redis Labs)
This document provides an overview and summary of Amazon ElastiCache for Redis. It discusses the key features of ElastiCache including easy deployment and monitoring, enhanced Redis engine capabilities, high availability, cost effectiveness, and integration with other AWS services. It also covers usage patterns such as database caching, streaming data processing, and building real-time apps. Finally, it discusses best practices for building resilient architectures on ElastiCache including reference architectures, failure scenarios, and open source contributions from AWS.
Critical Attributes for a High-Performance, Low-Latency Database (ScyllaDB)
This document discusses the attributes of a high-performance, low-latency database like ScyllaDB. It begins with introductions and an overview of ScyllaDB. It then summarizes how hardware has evolved over 20 years with more cores, memory, and faster disks. ScyllaDB was redesigned from first principles to take advantage of modern hardware, using an asynchronous, shared-nothing architecture with one shard per core. This allows it to achieve significantly higher performance than Cassandra. The document shows benchmark results demonstrating ScyllaDB's lower latencies and ability to scale to higher throughput. It also discusses how ScyllaDB uses workload prioritization to manage different types of workloads.
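The shard-per-core, shared-nothing idea can be illustrated with a toy key-to-shard mapping in Python. ScyllaDB's real implementation is in C++ and routes by token, so treat the hashing and names here as assumptions for the sketch.

```python
import hashlib

def shard_for_key(key, num_shards):
    """Map each key deterministically to one shard (one core), so that
    shard owns the data with no locks or cross-core coordination."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

class ShardedStore:
    """Toy shared-nothing store: one private dict per shard. In a real
    shard-per-core engine, each dict would live on its own core and be
    touched only by that core's thread."""

    def __init__(self, num_shards):
        self.num_shards = num_shards
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for_key(key, self.num_shards)][key] = value

    def get(self, key):
        return self.shards[shard_for_key(key, self.num_shards)].get(key)
```

Because routing is a pure function of the key, any core (or any client) can compute which shard owns a key without asking anyone, which is what removes the synchronization that limits lock-based designs.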
Real-time Machine Learning with Redis-ML
Shay Nativ from Redis Labs presented on using Redis and Redis-ML for real-time machine learning model serving. Redis-ML allows training models with tools like Spark and then deploying them to Redis for low-latency serving. This simplifies the ML lifecycle and improves performance and scalability compared to custom model serving. Shay demonstrated building a movie recommendation system using Spark for training random forests on the MovieLens dataset and deploying the models to Redis-ML for real-time recommendations with 60x faster performance than Spark alone.
Red Hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware (Red_Hat_Storage)
This document discusses how data growth driven by mobile, social media, IoT, and big data/cloud is requiring a fundamental shift in storage cost structures from scale-up to scale-out architectures. It provides an overview of key storage technologies and workloads driving public cloud storage, and how Ceph can help deliver on the promise of the cloud by providing next generation storage architectures with flash to enable new capabilities in small footprints. It also illustrates the wide performance range Ceph can provide for different workloads and hardware configurations.
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy (Stuart Pook)
Hadoop has become a critical part of Criteo's operations. What started out as a proof of concept has turned into two in-house bare-metal clusters of over 2200 nodes. Hadoop contains the data required for billing and, perhaps even more importantly, the data used to create the machine learning models, computed every 6 hours by Hadoop, that participate in real time bidding for online advertising.
Two clusters do not necessarily mean a redundant system, so Criteo must plan for any of the disasters that can destroy a cluster.
This talk describes how Criteo built its second cluster in a new datacenter, and how to do it better next time. It explains how a small team is able to run and expand these clusters. More importantly, the talk describes how a redundant data and compute solution must function at this scale, what Criteo has already done to create this solution, and what remains undone.
Fast, In-Memory SQL on Apache Cassandra with Apache Ignite (Rachel Pedreschi, ...) (DataStax)
This document discusses using Apache Ignite to enable in-memory SQL on Apache Cassandra. It provides an overview of GridGain's enterprise and open source strategies, with Ignite being based on the open source version. It then discusses EPAM's engineering capabilities. The remainder discusses Ignite's capabilities for scalable SQL queries with ACID transactions on Cassandra and provides a demo comparing performance of OLTP and OLAP queries between Cassandra and Ignite. Contact information and URLs for more information on Ignite and using it with Cassandra are also provided.
Spark + Flashblade: Spark Summit East talk by Brian Gold (Spark Summit)
Modern infrastructure and applications generate extraordinary volumes of log and telemetry data. At Pure Storage, we know this first hand: we have over 5PB of log data from production customers running our all-flash storage systems, from our engineering testbeds, and from test stations at manufacturing partners. Every part of our company — from engineering to sales — now depends on the insights we gather from this data. Given the diversity of our end users, it’s no surprise that our analysis tools comprise a broad mix of reporting queries, stream-processing operations, ad-hoc analyses, and deeper machine-learning algorithms. In this session, we will cover lessons learned from scaling our data warehouse and how we are leveraging Apache Spark’s capabilities as a central hub to meet our analytics demands.
Realtime Analytical Query Processing and Predictive Model Building on High Di... (Spark Summit)
Spark SQL and MLlib are optimized for running feature extraction and machine learning algorithms over row-based columnar datasets via full scans, but they provide no constructs for column indexing or time series analysis. For document datasets with timestamps, where features appear as a variable number of columns per document and use cases demand searching over columns and time to retrieve documents and generate learning models in real time, a close integration between Spark and Lucene was needed. We introduced LuceneDAO at Spark Summit Europe 2016 to build distributed Lucene shards from a data frame, but time series attributes were not part of the data model. In this talk we present our extension to LuceneDAO that maintains timestamps in the document-term view for search and allows time filters. Lucene shards maintain the time-aware document-term view for search and a vector space representation for machine learning pipelines. We use Spark as our distributed query processing engine, where each query is a boolean combination over terms with filters on time. LuceneDAO loads the shards onto Spark executors and powers sub-second distributed document retrieval for these queries.
Our synchronous API uses Spark-as-a-Service to power analytical queries, while our asynchronous API uses Kafka, Spark Streaming, and HBase to power time series prediction algorithms. In this talk we will demonstrate LuceneDAO write and read performance on millions of documents with over one million terms and configurable timestamp aggregate columns, and the latency of the APIs on a suite of queries generated from terms. The key takeaway from the talk will be a thorough understanding of how to make Lucene-powered, time-aware search a first-class citizen in Spark for building interactive analytical query processing and time series prediction algorithms.
SQream DB is designed for high-throughput analytics and takes advantage of IBM Power Systems architectures like Power9 that support high-bandwidth NVLink connections between the CPU and GPU. Benchmark tests showed SQream DB on an IBM Power9 system could load and query data 1.5 to 3.7 times faster than comparable x86-based systems due to the faster NVLink interconnect. SQream DB partitions and compresses data across the CPU and GPU for parallel processing to achieve high performance.
RedisConf17 - Building Large High Performance Redis Databases with Redis Ente... (Redis Labs)
This document discusses building large databases with Redis Enterprise (Redise) using flash memory. It introduces Redis Labs and their Redise product, which uses a clustered architecture to scale Redis deployments. Redise allows scaling data beyond RAM by extending into flash memory at a lower cost than using only RAM. Performance tests show Redise running on Intel Optane SSDs can achieve up to 9x higher throughput than traditional SSDs for large datasets. The document advocates Redise Flash as a cost-effective way to handle massive datasets with near-RAM latency.
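The RAM-plus-flash tiering described above can be sketched as a two-tier store. This toy Python model (an LRU dict for RAM, a plain dict for "flash") only illustrates the demote/promote idea, not Redis Enterprise's actual implementation; all names are assumptions.

```python
from collections import OrderedDict

class TieredStore:
    """Toy RAM-plus-flash tier: hot keys live in a small RAM tier (LRU),
    cold values are demoted to a larger, slower flash tier on overflow
    and promoted back to RAM when accessed again."""

    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()   # hot tier, ordered by recency
        self.flash = {}            # cold tier (would be SSD-backed)

    def put(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_val = self.ram.popitem(last=False)
            self.flash[cold_key] = cold_val     # demote coldest key

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)           # refresh recency
            return self.ram[key]
        if key in self.flash:
            value = self.flash.pop(key)
            self.put(key, value)                # promote on access
            return value
        return None
```

The economics follow directly: only the working set pays RAM prices, while the long tail of cold keys sits on flash at a fraction of the cost.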
PG-Strom is an extension of PostgreSQL that utilizes GPUs and NVMe SSDs to enable terabyte-scale data processing and in-database analytics. It features SSD-to-GPU Direct SQL, which loads data directly from NVMe SSDs to GPUs using RDMA, bypassing CPU and RAM. This improves query performance by reducing I/O traffic over the PCIe bus. PG-Strom also uses Apache Arrow columnar storage format to further boost performance by transferring only referenced columns and enabling vector processing on GPUs. Benchmark results show PG-Strom can process over a billion rows per second on a simple 1U server configuration with an NVIDIA GPU and multiple NVMe SSDs.
This document provides an introduction to HeteroDB, Inc. and its chief architect, KaiGai Kohei. It discusses PG-Strom, an open source PostgreSQL extension developed by HeteroDB for high performance data processing using heterogeneous architectures like GPUs. PG-Strom uses techniques like SSD-to-GPU direct data transfer and a columnar data store to accelerate analytics and reporting workloads on terabyte-scale log data using GPUs and NVMe SSDs. Benchmark results show PG-Strom can process terabyte workloads at throughput nearing the hardware limit of the storage and network infrastructure.
“Democratizing Big Data”, Ami Gal, CEO & Co-Founder of SQream Technologies (Dataconomy Media)
Watch more from Data Natives Tel Aviv 2016 here: http://bit.ly/2hw1MY0
Visit the conference website to learn more: http://telaviv.datanatives.io/
Follow Data Natives:
https://www.facebook.com/DataNatives
https://twitter.com/DataNativesConf
Stay Connected to Data Natives by Email: Subscribe to our newsletter to get the news first about Data Natives 2017: http://bit.ly/1WMJAqS
About the Author:
Ami Gal is the Co-Founder and CEO of SQream Technologies, where he is building a very fast SQL big-data database that crunches anywhere from a few terabytes to petabytes with high performance. He is a hands-on entrepreneur, a mentor at Seedcamp, and a SmartCamp mentor at IBM.
Fast data in times of crisis with GPU accelerated database QikkDB | Business ... (Matej Misik)
Graphics cards (GPUs) open up new ways of processing and analyzing big data, delivering millisecond selections over billions of rows, as well as telling stories about data. #QikkDB
How to present data to be understood by everyone? Data analysis is for scientists, but data storytelling is for everyone. For managers, product owners, sales teams, the general public. #TellStory
Learn about high performance computing with GPUs, and how to present data, through a rich Covid-19 data story example in the upcoming webinar.
22by7 and DellEMC Tech Day July 20 2017 - PowerEdge (Sashikris)
The document discusses Dell EMC's PowerEdge server solutions for modern data centers. It introduces the PowerEdge R940, R740-R740xd, R640, C6420, and M640-FC640 servers and highlights their key features. These include expanded processing, memory, storage and I/O capacity, intelligent automation capabilities, integrated security features, and workload optimization options. The servers are presented as providing adaptable, scalable and protected infrastructure for traditional and emerging workloads in the modern data center.
Harnessing the virtual realm for successful real world artificial intelligence (Alison B. Lowndes)
Artificial Intelligence is impacting all areas of society, from healthcare and transportation to smart cities and energy. This talk covers how NVIDIA invests both in internal pure research and in accelerated computation to enable its diverse customer base across gaming and extended reality, graphics, AI, robotics, simulation, high performance scientific computing, healthcare, and more. You will be introduced to the GPU computing platform and shown successfully deployed real-world applications, as well as a glimpse into the current state of the art across academia, enterprise, and startups.
Flash Memory Summit Enterprise Update 2019 (Howard Marks)
The document provides an annual update on enterprise flash storage trends. It discusses how flash has become mainstream for primary storage due to declining costs. All-flash arrays now have a larger market share than hybrid arrays. Emerging technologies discussed include NVMe over Fabrics, which extends NVMe protocols over Ethernet and Fibre Channel, and Storage Class Memory using 3D XPoint, which provides faster storage than NAND flash. The document highlights several vendors that are adopting these technologies.
The document discusses IBM's Power Systems as an expert platform for artificial intelligence. Some key points:
- Power Systems are designed for modern AI workloads, with accelerated computing capabilities like GPUs and FPGAs.
- The IBM Power AC922 server provides an "acceleration superhighway" between CPUs, GPUs, and other accelerators for optimal AI performance.
- Tests show the AC922 can reduce AI model training times by 3.8x compared to x86 systems, thanks to features like high bandwidth NVLink connections between components.
- IBM's PowerAI software tools help make AI development easier on the Power platform.
Backend.AI Technical Introduction (19.09 / 2019 Autumn) (Lablup Inc.)
This slide introduces technical specs and details about Backend.AI 19.09.
* On-premise clustering / container orchestration / scaling on cloud
* Container-level fractional GPU technology that lets many containers share one GPU at the same time, each seeing its own fraction
* NVIDIA GPU Cloud integrations
* Enterprise features
Red Hat Storage Day Atlanta - Designing Ceph Clusters Using Intel-Based Hardw... (Red_Hat_Storage)
This document discusses the need for storage modernization driven by trends like mobile, social media, IoT and big data. It outlines how scale-out architectures using open source Ceph software can help meet this need more cost effectively than traditional scale-up storage. Specific optimizations for IOPS, throughput and capacity are described. Intel is presented as helping advance the industry through open source contributions and optimized platforms, software and SSD technologies. Real-world examples are given showing the wide performance range Ceph can provide.
Building a High Performance Analytics Platform (Santanu Dey)
The document discusses using flash memory to build a high performance data platform. It notes that flash memory is faster than disk storage and cheaper than RAM. The platform utilizes NVMe flash drives connected via PCIe for high speed performance. This allows it to provide in-memory database speeds at the cost and density of solid state drives. It can scale independently by adding compute nodes or storage nodes. The platform offers a unified database for both real-time and analytical workloads through common APIs.
HPC Infrastructure To Solve The CFD Grand Challenge (Anand Haridass)
This document summarizes Anand Haridass' presentation on using HPC infrastructure to solve computational fluid dynamics (CFD) grand challenges. It discusses how CFD utilizes physics, mathematics, computational geometry, and computer science. Solving CFD problems is bound by memory usage, computation needs, and network requirements. The presentation outlines IBM's POWER processor roadmap and how the POWER9 will have stronger cores, enhanced caches, and improved interfaces like NVLink and CAPI to accelerate workloads like CFD. Case studies demonstrate how IBM systems using GPUs and NVLink can provide faster performance for CFD codes and reservoir simulations.
Dyn delivers exceptional Internet Performance. Enabling high quality services requires data centers around the globe. In order to manage services, customers need timely insight collected from all over the world. Dyn uses DataStax Enterprise (DSE) to deploy complex clusters across multiple datacenters to enable sub 50 ms query responses for hundreds of billions of data points. From granular DNS traffic data, to aggregated counts for a variety of report dimensions, DSE at Dyn has been up since 2013 and has shined through upgrades, data center migrations, DDoS attacks and hardware failures. In this webinar, Principal Engineers Tim Chadwick and Rick Bross cover the requirements which led them to choose DSE as their go-to Big Data solution, the path which led to SPARK, and the lessons that we’ve learned in the process.
Getting Started with Big Data and HPC in the Cloud - August 2015 (Amazon Web Services)
How can you use Big Data to grow your business and discover new opportunities? When organizations effectively capture, analyze, visualize and apply big data insights to their business goals, they differentiate themselves from their competitors and outperform them in terms of operational efficiency and the bottom line. With Amazon Web Services, businesses and researchers can easily fulfill their high performance computing (HPC) requirements with the added benefit of ad-hoc provisioning, pay-as-you-go pricing and faster time-to-results. Join this session to understand how to run HPC applications in AWS cloud, and about different AWS Big Data and Analytics services such as Amazon Elastic MapReduce (Hadoop), Amazon Redshift (Data Warehouse) and Amazon Kinesis (Streaming), when to use them and how they work together.
20181116 Massive Log Processing using I/O optimized PostgreSQL (Kohei KaiGai)
The document describes a technology called PG-Strom that uses GPU acceleration to optimize I/O performance for PostgreSQL. PG-Strom allows data to be transferred directly from NVMe SSDs to the GPU over the PCIe bus, bypassing the CPU and RAM. This reduces data movement and allows PostgreSQL queries to be partially executed directly on the GPU. Benchmark results show the approach can achieve throughput close to the theoretical hardware limits for a single server configuration processing large datasets.
This document discusses NVIDIA's chips for automotive, HPC, and networking. For automotive, it describes the Tegra line of SOC chips used in cars like Tesla, and upcoming chips like Orin and Atlan. For HPC, it introduces the upcoming Grace CPU designed for giant AI models. For networking, it presents the BlueField line of data processing units (DPUs) including the new 400Gbps BlueField-3 chip and the DOCA software framework. The document emphasizes that NVIDIA's GPU, CPU, and DPU chips make yearly leaps while sharing a common architecture.
Big Data is everywhere these days. But what is it and how can you use it to fuel your business? Data is as important to organizations as labour and capital, and if organizations can effectively capture, analyze, visualize and apply big data insights to their business goals, they can differentiate themselves from their competitors and outperform them in terms of operational efficiency and the bottom line.
Join this session to understand the different AWS Big Data and Analytics services such as Amazon Elastic MapReduce (Hadoop), Amazon Redshift (Data Warehouse) and Amazon Kinesis (Streaming), when to use them and how they work together.
Reasons to attend:
Learn how AWS can help you process and make better use of your data with meaningful insights.
Learn about Amazon Elastic MapReduce, a managed Hadoop service, and Amazon Redshift, a fully managed petabyte-scale data warehouse solution.
Learn about real time data processing with Amazon Kinesis.
This document provides an overview of Amazon Redshift presented by Pavan Pothukuchi and Chris Liu. The agenda includes an introduction to Redshift, its benefits, use cases, and Coursera's experience using Redshift. Some key benefits highlighted are that Redshift is fast, inexpensive, fully managed, secure, and innovates quickly. Example use cases from NTT Docomo and Nasdaq are discussed. Chris Liu then discusses Coursera's experience moving from no data warehouse to using Redshift over three years, including their current ecosystem involving Redshift, other AWS services, and business intelligence applications. Lessons learned around thinking in Redshift, communicating with users, surprises, and reflections are also shared.
The document describes a 5-day residency program hosted by the OpenPOWER Academic Discussion Group (ADG) at NIE Mysore from June 6-10, 2022. The program aims to bridge industry and academia knowledge in chip design by developing curriculum on OpenPOWER technology and training lab assistants. Engineers and academicians with 5+ years experience in chip design/verification are eligible to participate. They will collaborate on developing course materials and lab exercises to teach undergraduate students in fields like ECE and CSE. The program seeks to help fulfill India's goals in chip design manpower and self-reliance through initiatives like Make in India and the India Semiconductor Mission.
This document provides an overview of digital design and Verilog. It discusses binary numbers and boolean algebra as the foundation of digital systems. It also describes logic gates, combinational and sequential circuits, finite state machines, and datapath and control units. Finally, it introduces Verilog, describing different modeling types like gate level, behavioral, dataflow, and switch level modeling. It positions Verilog as a hardware description language used to more easily design digital circuits compared to manual drawing.
The Libre-SOC Project aims to create an entirely Libre-Licensed, transparently-developed fully auditable Hybrid 3D CPU-GPU-VPU, using the Supercomputer-class OpenPOWER ISA as the foundation.
Our first test ASIC is a 180nm "Fixed-Point" Power ISA v3.0B processor, 5.1mm x 5.9mm, as a proof-of-concept for the team, whose primary expertise is in Software Engineering. Software Engineering training brings a radically different approach to Hardware development: extensive unit tests, source code revision control, automated development tools are normal. Libre Project Management brings even more: bug trackers, mailing lists, auditable IRC logs and a wiki are standard fare for Libre Projects that are simply not normal Industry-Standard practice.
This talk therefore goes through the workflow, from the original HDL through to the GDS-II layout, showing how we were able to keep track of the development that led to the IMEC 180nm tape-out in July 2021. In particular, we will show how, by following a parallel development process involving "Real" and "Symbolic" Cell Libraries developed by Chips4Makers, our developers did not need to sign a Foundry NDA but were still able to work side-by-side with a University that did. With this parallel development process, the University upheld their NDA obligations, and Libre-SOC was simultaneously able to honour its Transparency Objectives.
Workload Transformation and Innovations in POWER Architecture Ganesan Narayanasamy
The IT industry is going through two major transformations. The first is the adoption of AI and its tight integration into commercial applications and enterprise workflows. The second is the transformation of software architecture through concepts like microservices and cloud-native design. These transformations, alongside the aggressive adoption of IoT, mobile and 5G in our day-to-day activities, are making the world operate in a more real-time manner, which opens up a new challenge: improving hardware architecture to meet these requirements. Together they push the boundaries of the entire systems stack, making designers rethink hardware. This talk presents a picture of how the industry-leading enterprise POWER architecture is transforming to fulfill the performance demands of these newer-generation workloads, with a primary focus on on-chip AI acceleration.
Join us on Friday, July 16th 2021, for our newest workshop with DoMS, IIT Roorkee: Concept to Solutions using the OpenPOWER Stack. It's time to discover advances in #DeepLearning tools and techniques from the world's leading innovators across industries, research, and public speaking.
Register here:
https://lnkd.in/ggxMq2N
This presentation covers two uses cases using OpenPOWER Systems
1. Diabetic Retinopathy using AI on NVIDIA Jetson Nano: The objective is to classify the diabetic retinopathy level solely from retina images in remote areas with minimal doctor involvement. The model uses the VGG16 network architecture and is trained from scratch on POWER9. The trained model was deployed on the Jetson Nano board.
2. Classifying Covid positivity using lung X-ray images: The idea is to build ML models to detect positive cases from X-ray images. The model was trained on POWER9, and the application was developed in Python.
IBM Bayesian Optimization Accelerator (BOA) is a do-it-yourself toolkit to apply state-of-the-art Bayesian inferencing techniques and obtain optimal solutions for complex, real-world design simulations without requiring deep machine learning skills. This talk will describe IBM BOA, its differentiation and ease of use, and how researchers can take advantage of it for optimizing any arbitrary HPC simulation.
This presentation covers the partners and collaborators currently working with the OpenPOWER Foundation, use cases of OpenPOWER systems in multiple industries, OpenPOWER workgroups, and OpenCAPI features.
The IBM POWER10 processor represents the 10th generation of the POWER family of enterprise computing engines. Its performance is a result of both powerful processing cores and high-bandwidth intra- and inter-chip interconnect. POWER10 systems can be configured with up to 16 processor chips and 1920 simultaneous threads of execution. Cross-system memory sharing, through the new Memory Inception technology, and 2 Petabytes of addressing space support an expansive memory system. The POWER10 processing core has been significantly enhanced over its POWER9 predecessor, including a doubling of vector units and the addition of an all-new matrix math engine. Throughput gains from POWER9 to POWER10 average 30% at the core level and three-fold at the socket level. Those gains can reach ten- or twenty-fold at the socket level for matrix-intensive computations.
Everything is changing, from healthcare to the automotive markets, without forgetting financial markets or any type of engineering: everything has stopped being created by an individual or, at best, a team, and is now developed and perfected using AI and hundreds of computers. Even AI is something we can no longer run on a single computer, no matter how powerful it is. What drives everything today is HPC, or High-Performance Computing, heavily linked to AI. In this session we will discuss AI, HPC, the IBM Power architecture, and how it can help develop better healthcare, better automobiles, better financials, and better everything that we run on them.
Macromolecular crystallography is an experimental technique for exploring the 3D atomic structure of proteins, used by academics for research in biology and by pharmaceutical companies in rational drug design. While until now the development of the technique was limited by the performance of scientific instruments, computing performance has recently become a key limitation. In my presentation I will describe the computing challenge of handling the 18 GB/s data stream coming from the new X-ray detector. I will show PSI's experience applying conventional hardware to the task and why this attempt failed. I will then present how the IC 922 server with OpenCAPI-enabled FPGA boards allowed us to build a sustainable and scalable solution for high-speed data acquisition. Finally, I will give a perspective on how advances in hardware development will enable better science by users of the Swiss Light Source.
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systemsGanesan Narayanasamy
As the adoption of AI technologies increases and matures, the focus will shift from exploration to time to market, productivity, and integration with existing workflows. Governing enterprise data, scaling AI model development, and selecting a complete, collaborative hybrid platform and tools for rapid solution deployment are key focus areas for growing data science teams tasked with responding to business challenges. This talk will cover the challenges and innovations for AI at scale in industries such as healthcare and automotive, the AI ladder and AI lifecycle, and infrastructure architecture considerations.
This talk gives an introduction to healthcare use cases, the AI ladder, and AI-at-scale lifecycle themes. The iterative nature of the workflow and some of the important components to be aware of when developing AI healthcare solutions are discussed. The different types of algorithms, and when machine learning might be more appropriate than deep learning or the other way around, are also covered. Example use cases are shared as part of this presentation.
Healthcare has become one of the most important aspects of everyone's life. Its importance has surged due to the latest outbreaks, and this latest pandemic has made it mandatory to collaborate to improve everyone's healthcare as soon as possible.
IBM has reacted quickly, sharing not only its knowledge but also its Artificial Intelligence supercomputers all around the world.
Those supercomputers are helping to overcome this outbreak, and future ones as well.
They have completely different features compared to offerings from other players in the supercomputer market.
We will take a quick look at the differences between these AI-focused supercomputers and how they can help in the R&D of healthcare solutions for everyone, from those with access to a big IBM AI supercomputer to those with access to only one small IBM AI-focused server.
Moving object recognition (MOR) corresponds to the localization and classification of moving objects in videos. Discriminating moving objects from static objects and background in videos is an essential task for many computer vision applications. MOR has widespread applications in intelligent visual surveillance, intrusion detection, anomaly detection and monitoring, industrial sites monitoring, detection-based tracking, autonomous vehicles, etc. In this session, Murari provided a poster about the deep learning algorithms to identify both locations and corresponding categories of moving objects with a convolutional network. The challenges in developing such algorithms have been discussed.
The document discusses AI in the enterprise, including use cases, infrastructure considerations, and the AI lifecycle. It provides examples of how AI can be applied in various industries and common patterns of analytics using AI. It also outlines the data science model development workflow and considerations for AI infrastructure, software, and data management throughout the AI lifecycle.
"Making .NET Application Even Faster", Sergey Teplyakov.pptxFwdays
In this talk we're going to explore the performance improvement lifecycle, starting with setting performance goals, using profilers to figure out the bottlenecks, making a fix, and validating that the fix works by benchmarking it. The talk will be useful for novice and seasoned .NET developers and architects interested in making their applications fast and understanding how things work under the hood.
Demystifying Neural Networks And Building Cybersecurity ApplicationsPriyanka Aash
In today's rapidly evolving technological landscape, Artificial Neural Networks (ANNs) have emerged as a cornerstone of artificial intelligence, revolutionizing various fields including cybersecurity. Inspired by the intricacies of the human brain, ANNs have a rich history and a complex structure that enables them to learn and make decisions. This blog aims to unravel the mysteries of neural networks, explore their mathematical foundations, and demonstrate their practical applications, particularly in building robust malware detection systems using Convolutional Neural Networks (CNNs).
Latest Tech Trends Series 2024 By EY IndiaEYIndia1
Stay ahead of the curve with our comprehensive Tech Trends Series! Explore the latest technology trends shaping the world today, from the 2024 Tech Trends report and top emerging technologies to their impact on business technology trends. This series delves into the most significant technological advancements, giving you insights into both established and emerging tech trends that will revolutionize various industries.
Improving Learning Content Efficiency with Reusable Learning ContentEnterprise Knowledge
Enterprise Knowledge’s Emily Crockett, Content Engineering Consultant, presented “Improve Learning Content Efficiency with Reusable Learning Content” at the Learning Ideas conference on June 13th, 2024.
This presentation explored the basics of reusable learning content, including the types of reuse and the key benefits of reuse such as improved content maintenance efficiency, reduced organizational risk, and scalable differentiated instruction & personalization. After this primer on reuse, Crockett laid out the basic steps to start building reusable learning content alongside a real-life example and the technology stack needed to support dynamic content. Key objectives included:
- Be able to explain the difference between reusable learning content and duplicate content
- Explore how a well-designed learning content model can reduce duplicate content and improve your team’s efficiency
- Identify key tasks and steps in creating a learning content model
Keynote : AI & Future Of Offensive SecurityPriyanka Aash
In the presentation, the focus is on the transformative impact of artificial intelligence (AI) in cybersecurity, particularly in the context of malware generation and adversarial attacks. AI promises to revolutionize the field by enabling scalable solutions to historically challenging problems such as continuous threat simulation, autonomous attack path generation, and the creation of sophisticated attack payloads. The discussions underscore how AI-powered tools like AI-based penetration testing can outpace traditional methods, enhancing security posture by efficiently identifying and mitigating vulnerabilities across complex attack surfaces. The use of AI in red teaming further amplifies these capabilities, allowing organizations to validate security controls effectively against diverse adversarial scenarios. These advancements not only streamline testing processes but also bolster defense strategies, ensuring readiness against evolving cyber threats.
The Zaitechno Handheld Raman Spectrometer is a powerful and portable tool for rapid, non-destructive chemical analysis. It utilizes Raman spectroscopy, a technique that analyzes the vibrational fingerprint of molecules to identify their chemical composition. This handheld instrument allows for on-site analysis of materials, making it ideal for a variety of applications, including:
Material identification: Identify unknown materials, minerals, and contaminants.
Quality control: Ensure the quality and consistency of raw materials and finished products.
Pharmaceutical analysis: Verify the identity and purity of pharmaceutical compounds.
Food safety testing: Detect contaminants and adulterants in food products.
Field analysis: Analyze materials in the field, such as during environmental monitoring or forensic investigations.
The Zaitechno Handheld Raman Spectrometer is easy to use and features a user-friendly interface. It is compact and lightweight, making it ideal for field applications. With its rapid analysis capabilities, the Zaitechno Handheld Raman Spectrometer can help you improve efficiency and productivity in your research or quality control workflows.
The History of Embeddings & Multimodal EmbeddingsZilliz
Frank Liu will walk through the history of embeddings and how we got to the cool embedding models used today. He'll end with a demo on how multimodal RAG is used.
DefCamp_2016_Chemerkin_Yury-publish.pdf - Presentation by Yury Chemerkin at DefCamp 2016 discussing mobile app vulnerabilities, data protection issues, and analysis of security levels across different types of mobile applications.
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...Zilliz
Enterprises have traditionally prioritized data quantity, assuming more is better for AI performance. However, a new reality is setting in: high-quality data, not just volume, is the key. This shift exposes a critical gap – many organizations struggle to understand their existing data and lack effective curation strategies and tools. This talk dives into these data challenges and explores the methods of automating data curation.
Choosing the Best Outlook OST to PST Converter: Key Features and Considerationswebbyacad software
When looking for a good software utility to convert Outlook OST files to PST format, it is important to find one that is easy to use and has useful features. WebbyAcad OST to PST Converter Tool is a great choice because it is simple to use for anyone, whether you are tech-savvy or not. It can smoothly change your files to PST while keeping all your data safe and secure. Plus, it can handle large amounts of data and convert multiple files at once, which can save you a lot of time. It even comes with 24*7 technical support assistance and a free trial, so you can try it out before making a decision. Whether you need to recover, move, or back up your data, Webbyacad OST to PST Converter is a reliable option that gives you all the support you need to manage your Outlook data effectively.
It's your unstructured data: How to get your GenAI app to production (and spe...Zilliz
So you've successfully built a GenAI app POC for your company -- now comes the hard part: bringing it to production. Aparavi addresses the challenges of AI projects while addressing data privacy and PII. Our Service for RAG helps AI developers and data scientists to scale their app to 1000s to millions of users using corporate unstructured data. Aparavi’s AI Data Loader cleans, prepares and then loads only the relevant unstructured data for each AI project/app, enabling you to operationalize the creation of GenAI apps easily and accurately while giving you the time to focus on what you really want to do - building a great AI application with useful and relevant context. All within your environment and never having to share private corporate data with anyone - not even Aparavi.
4. BUT DATA WAREHOUSES WERE NOT BUILT TO HANDLE THIS LEVEL OF DATA
(timeline diagram of database technology leading to massive data: classic relational databases such as Oracle, IBM DB2, Teradata and SQL Server in the 1970s-1990s; NoSQL & Hadoop systems such as Hive and MongoDB in 1990-2010; MPP appliances such as IBM Netezza, Vertica, Redshift and Oracle Exadata in 2005-2010; in-memory systems such as MemSQL, VoltDB, Aerospike and DB2 BLU from 2010 on; and GPU databases such as SQream DB, Kinetica and MapD)
5. X86 CPU SYSTEMS ARE NOT ADVANCING - THE PROCESS TAKES A REALLY LONG TIME
(diagram of a typical analytics pipeline: data sources feed a data lake, then ETL + cubes + aggregation + indexing into a legacy MPP database running on 1000s of CPUs, before results reach BI customers; individual stages take from 30 minutes up to 3-5 hours, with a further 1-2 hours before BI customers see results)
9. SQREAM DB - COMPLEMENTS EXISTING INFRASTRUCTURE
- POWERED BY GPUs: massively parallel engine; faster and smaller than CPU-based systems
- MASSIVELY SCALABLE: terabytes to petabytes; not limited by RAM
- LIGHTNING FAST: ingests 3 TB/hr/GPU; powerful columnar storage; always-on compression
- SQL DATABASE: familiar ANSI SQL; standard connectors
- MINIMAL FOOTPRINT: 100 TB in a 2U server; highly cost-efficient
- EXTENSIBLE FOR ML/AI: Python, AI, Jupyter, etc.; built for data science
10. SCALE-UP SOLUTION
- SQream DB can scale up by expanding the attached storage, or out by adding additional compute nodes
(diagram: compute nodes connected to shared storage over a BI fabric and a storage fabric through an HP SN6000B 16Gb FC switch)
11. HIGH THROUGHPUT CONVERGED
- SQream DB is designed for high throughput
- IBM Power Systems is the only architecture with NVLink-enabled CPU-to-GPU connectivity
- The IBM AC922, with POWER9 and NVLink, can transfer data at up to 300GB/s, almost 9.5x faster than the PCIe 3.0 found in x86-based architectures, reducing classic I/O bottlenecks
(diagram: two IBM POWER9 CPUs, each connected to 2x NVIDIA Tesla V100 GPUs)
12. GPU-ACCELERATED DATA WAREHOUSE SQREAM DB BOOSTS QUERY PERFORMANCE BY UP TO 150% FOR IBM POWER9 USERS
“GPU-accelerated analytics are an increasingly important part of our industry. The announcement of SQream on the IBM POWER9 platform takes this concept to another level of performance, as the POWER9 CPU with embedded NVIDIA NVLink interface to NVIDIA’s GPUs allows SQream to enable even faster processing of data on POWER9 servers.”
— Sumit Gupta, VP of HPC and AI for IBM Cognitive Systems
13. HIGH THROUGHPUT ARCHITECTURE - IT'S NOT JUST THE CORES
(architecture diagram: two IBM POWER9 CPUs joined by the IBM SMP bus; each CPU reaches its RAM at 170GB/s per CPU and connects to two NVIDIA Tesla V100 GPUs over NVLink at 300GB/s bidirectional; each GPU reaches its VRAM at 900GB/s)
14. UP TO 2x FASTER LOADING - SQREAM DB ON POWER9
- SQream DB relies on the CPU as well as GPUs for loading
- IBM's POWER9 multi-core architecture makes loading much faster than comparable x86-based systems
- The IBM POWER9 system loaded data nearly twice as fast as the x86-based machine

Load time for 6 billion TPC-H records (seconds, lower is better):
- Dell PowerEdge R740: 1,929
- IBM Power9 AC922: 1,094

Test configurations:
- IBM Power9 AC922: 2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLINK - 16GB)
- Dell PowerEdge R740: 2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256GB DDR4 2666MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)
15. UP TO 3.7x FASTER QUERIES - SQREAM DB ON POWER9
- SQream DB on POWER9 is between 150% and 370% faster than comparable x86 architectures
- CPU-GPU NVLink bandwidth is key to performance in complex queries

SQream DB query times, IBM Power9 vs Intel Xeon (Skylake), in seconds (lower is better):
- TPC-H Query 8: Dell PowerEdge R740 52.83 vs IBM Power9 AC922 14.06
- TPC-H Query 6: 10.35 vs 2.8
- TPC-H Query 19: 84.5 vs 30.29
- TPC-H Query 17: 78.57 vs 29.01

Test configurations:
- IBM Power9 AC922: 2x POWER9 16C @ 3.8GHz | 256 GB DDR4 2666 MHz | SSD storage | 4x NVIDIA Tesla V100 (SXM2 NVLINK - 16GB)
- Dell PowerEdge R740: 2x Intel Xeon Silver 4112 CPU @ 2.60GHz | 256GB DDR4 2666MHz | SSD storage | 4x NVIDIA Tesla V100 (PCIe - 16GB)
16. DATA EXPLORATION MADE EASY
- Query raw data directly
- Immediate ad-hoc querying
- Ideal for data science and discovery
Features: multiple JOINs on any field | time series | regular expressions | ANSI-92 compatible | window analysis | ODBC, JDBC and Python connectivity
17. HOW IT WORKS
(pipeline diagram: raw data is split into chunks during ingest, compressed with automatic adaptive compression, and stored columnar with metadata tagging; at query time, data skipping avoids irrelevant chunks and the remaining chunks are processed in parallel on the GPU)
18. CONCEPT 1: COLUMNAR
- Columnar databases are very common and efficient for analytics
- Good for big data analysis: aggregations over days, per account
- Columnar databases compress data better because of the higher data locality
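The compression benefit of data locality can be seen with a toy illustration (this is not SQream DB's actual storage format; the table layout and values are invented for the demo). A repetitive column, such as a year, collapses to almost nothing when its values are stored contiguously, but compresses poorly when interleaved with other columns in a row store:

```python
import struct
import zlib

# Two columns per row: a pseudo-random measurement and a highly
# repetitive year value (50,000 rows of 2017, then 50,000 of 2018).
n = 100_000
rows = [((i * 2654435761) % 65536, 2017 + i // 50_000) for i in range(n)]

# Row store: values of different columns interleaved on disk.
row_store = b"".join(struct.pack("<HH", v, y) for v, y in rows)

# Column store: each column laid out contiguously (higher data locality).
col_store = struct.pack(f"<{n}H", *(v for v, _ in rows)) + \
            struct.pack(f"<{n}H", *(y for _, y in rows))

assert len(row_store) == len(col_store)  # same raw bytes either way
row_c, col_c = zlib.compress(row_store), zlib.compress(col_store)
print(f"row store:    {len(row_store)} -> {len(row_c)} bytes")
print(f"column store: {len(col_store)} -> {len(col_c)} bytes")
# The year column compresses to near-zero only in the columnar layout,
# so the column store ends up noticeably smaller.
```

The raw data is identical in both layouts; only the ordering changes, which is why columnar systems get better compression ratios without any extra work at query time.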
19. CONCEPT 2: CHUNKING
SQream DB tables enable scalability by partitioning data in multiple dimensions. We call this chunking. Chunking is performed automatically and transparently during ingest.
(diagram: a table is split into chunks along rows, with each chunk further split by column)
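A minimal sketch of the partitioning idea (this is not SQream DB's ingest code; `chunk_table` and the tiny chunk size are invented for illustration). Each on-disk unit ends up being one column of one horizontal slice of rows:

```python
CHUNK_ROWS = 4  # toy value; real systems use millions of rows per chunk

def chunk_table(columns: dict) -> dict:
    """Split {column_name: values} into {(column_name, chunk_no): values},
    partitioning the table along both the row and column dimensions."""
    chunks = {}
    n_rows = len(next(iter(columns.values())))
    for name, values in columns.items():
        for chunk_no, start in enumerate(range(0, n_rows, CHUNK_ROWS)):
            chunks[(name, chunk_no)] = values[start:start + CHUNK_ROWS]
    return chunks

table = {"year": [2017] * 6 + [2018] * 4, "val": list(range(10))}
for key, data in sorted(chunk_table(table).items()):
    print(key, data)  # e.g. ('year', 0) [2017, 2017, 2017, 2017]
```

Because every (column, chunk) pair is an independent unit, a query can read only the columns it touches and only the chunks that might match, which is what makes the data-skipping concept on the next slide possible.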
20. CONCEPT 3: ZONE MAPS
- Always on, calculated for every chunk
- Automatic, transparent index replacement
- Example: SELECT * FROM t WHERE YEAR > 2017 - all chunks with YEAR <= 2017 can be skipped
(illustration: a table with columns day, month, year, val1, val2, val3; the chunk holding months 10-12 of 2017 is skipped, and only the chunk holding months 01-03 of 2018 is read)
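The skipping logic can be sketched in a few lines (this is not SQream DB's internal metadata structure; the function names and chunk data are invented for illustration). A zone map records the min/max of each column per chunk, so a predicate like `YEAR > 2017` only needs the chunks whose maximum exceeds 2017:

```python
def build_zone_map(chunks):
    """For each chunk, record (min, max) of its 'year' column."""
    return [(min(c["year"]), max(c["year"])) for c in chunks]

def chunks_to_read(zone_map, floor):
    """Indices of chunks that may contain rows satisfying year > floor."""
    return [i for i, (_, hi) in enumerate(zone_map) if hi > floor]

chunks = [
    {"year": [2017, 2017, 2017]},  # max is 2017: skipped entirely
    {"year": [2017, 2018]},        # may contain 2018 rows: read
    {"year": [2018, 2018, 2018]},  # read
]
zm = build_zone_map(chunks)
print(chunks_to_read(zm, 2017))  # -> [1, 2]
```

The metadata is tiny (two values per column per chunk), yet it lets the engine discard whole chunks without touching the data, which is why the slide calls it a transparent index replacement.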
25. CUT THE COST OF PERFORMANCE
ACV calculation on 24 TB of data, 300B rows, 8 tables with complex, nested joins:

IBM Netezza (8 full 42U racks, 56 S-Blades, 7 TB RAM):
- Average query time: 33.70 seconds
- Compression ratio: 4.0
- Processing units: 56 S-Blades
- Ownership cost: $12,000,000

SQream DB on x86 Dell C4130 (4x NVIDIA Tesla GPUs, 512 GB RAM + iSCSI JBOD (20TB)):
- Average query time: 31.70 seconds
- Compression ratio: 4.7
- Processing units: 4 GPUs
- Ownership cost: $500,000
26. FEEL FREE TO CONTACT
ADDRESS: Headquarters, 7 WTC, 250 Greenwich Street, New York, New York
David Garber, Sales Manager, West
davidg@sqream.com | sqream.com
WE ARE SOCIAL