Machine learning on clinical datasets to predict the risk of chronic conditions such as type 2 diabetes mellitus in advance, and to predict outcomes such as hospital readmission, using real-world evidence (RWE) data from electronic medical records (EMR).
Healthcare stands to gain significant ground with the help of domain-specific AI capabilities that were historically powered by humans. As a result, the next generation of healthcare has already begun, and it’s being revolutionized by AI.
Predictive Analytics and Machine Learning for Healthcare - Diabetes
2. DATA ANALYTICS IN HEALTHCARE & LIFE SCIENCES
1. VITAL BUSINESS PROBLEMS:
Many different problems exist, with varying degrees of complexity:
- What drives favorable clinical outcomes
- Drivers of adverse events
- Factors impacting cost of care
- Earlier diagnosis of cancers and chronic diseases
Understanding these business problems is critical for generating possible solutions.
2. POTENTIAL DATA SOURCES:
Huge amounts of data are now being generated by many different sources capable of capturing information:
- Electronic Health Records
- Healthcare claims from Insurance companies
- Pharmacies – claims and medication reviews
- Lab tests and Imaging results
- Population health data – Social Determinants of Health
- Genomics (and later Proteomics and Metabolomics)
- Wearable and other devices
- Other sources (Surveys, Patient Reported Outcomes)
The volume, velocity, variety, and veracity of the data being generated pose a staggering challenge: a typical Big Data problem.
3. DATA PROCESSING, MANAGEMENT AND ANALYSIS:
Making sense of these varied data sources and processing them into a form that is useful for analysis is a data engineering challenge.
Structured data needs to be cleaned and curated; data from different sources needs to be matched to build a complete 360-degree view of the customer.
Semi-structured and unstructured data sources (physician notes, imaging data) are hard to curate and store in a way that lets the information be retrieved and analyzed at scale and speed.
Various Big Data technologies have been developed to store (Hadoop ecosystem, Spark) and analyze (text mining, NLP, deep learning for image and video analytics) semi-structured and unstructured data.
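As a rough illustration of the cleaning and matching step described above, the sketch below links rows from two sources on a normalized patient identifier. The field names, sample values, and the helper functions `clean_id` and `build_patient_view` are hypothetical, chosen only for illustration; real record linkage usually needs fuzzier matching than an exact ID join.

```python
from collections import defaultdict

# Hypothetical extracts from two sources; field names and values are illustrative only.
ehr_rows = [
    {"patient_id": "P001", "age": 54, "diagnosis": "E11.9"},
    {"patient_id": "P002", "age": 61, "diagnosis": "I10"},
]
claims_rows = [
    {"patient_id": " p001", "claim_amount": 1200.0},  # messy ID: stray space, lower case
    {"patient_id": "P001", "claim_amount": 300.0},
    {"patient_id": "P003", "claim_amount": 80.0},     # no matching EHR row
]


def clean_id(raw):
    """Normalize an identifier by trimming whitespace and upper-casing it."""
    return raw.strip().upper()


def build_patient_view(ehr, claims):
    """Combine EHR and claims rows into one record per patient."""
    view = defaultdict(lambda: {"ehr": None, "claims": []})
    for row in ehr:
        view[clean_id(row["patient_id"])]["ehr"] = row
    for row in claims:
        view[clean_id(row["patient_id"])]["claims"].append(row["claim_amount"])
    return dict(view)


patient_view = build_patient_view(ehr_rows, claims_rows)
total_p001 = sum(patient_view["P001"]["claims"])
print(total_p001)
```

Patients that appear in only one source (here `P003`) are kept with an empty EHR slot rather than dropped, so gaps in coverage stay visible to the analyst.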
4. SOLUTIONS TO THE PROBLEMS:
At the end of the day, all the analysis should be able to generate actionable insights. Interpretation of the results and their implementation to solve the problem are key.
3. HOW ML/DL CAN AUGMENT THE DECISION-MAKING PROCESS FOR CLINICIANS
PROGNOSIS
A machine-learning model can learn the patterns of health trajectories of vast numbers of patients. This facility can help physicians to anticipate future events at an expert level, drawing from information well beyond the individual physician’s practice experience. For example, how likely is it that a patient will be able to return to work, or how quickly will the disease progress?
DIAGNOSIS
A diagnostic error will occur in the care of nearly every patient in his or her lifetime, and receiving the right diagnosis is critical to receiving appropriate care. This problem is not limited to rare conditions. Cardiac chest pain, TB, dysentery, and complications of childbirth commonly go undetected in developing countries.
TREATMENT
In a large health care system with tens of thousands of physicians treating tens of millions of patients, there is variation in when and why patients present for care and how patients with similar conditions are treated. Can a model sort through these natural variations to help physicians identify when the collective experience points to a preferred treatment pathway?
CLINICAL WORKFLOW
The same machine-learning techniques that are used in many consumer products can be used to make clinicians more efficient. Machine learning that drives search engines can help surface the required information in a patient’s chart for a clinician without multiple clicks. Data entry of forms and text fields can be improved with the use of machine-learning techniques.
REMOTE AREAS
There is no way for physicians to individually interact with all the patients who may need care. Can machine learning extend the reach of clinicians to provide expert-level medical assessment without their personal involvement? For example, patients with new rashes may be able to obtain a diagnosis by sending a picture that they take on their smartphones, thereby averting unnecessary urgent-care visits.
REFERENCE: https://www.nejm.org/doi/full/10.1056/NEJMra1814259
4. COMPONENTS OF ELECTRONIC HEALTH RECORDS
[Diagram: EMR components: demographics and history, drugs, allergies, visits, admissions, diagnoses, lab results, procedures]
ADDITIONAL DATA FACTORS (normally not present)
GENOMICS
SOCIAL DETERMINANTS OF HEALTH
IMAGING DATA – X-RAY/USG/CT/MRI
PATIENT REPORTED OUTCOMES - PRO
STANDARD EMR/EHR DATA COMPONENTS
DEMOGRAPHICS – Age, Gender, Race, Language, Religion, Insurance, Location
CLINICAL HISTORY – Habits, Past Dx and Observations
MEDICATIONS – Drug NDC, Quantity, Refills, Route, Rx dates
FOOD AND DRUG ALLERGIES – Allergen, Reaction Desc., Severity, Dates
VISITS TO ER AND OPD – Date/Time, Encounter Type, Provider Info
INPATIENT ADMISSIONS – Date/Time, Source, Discharge Code
PRIMARY DIAGNOSES AND COMORBIDITIES – ICD9/10, SNOMED
PROCEDURES AND SURGERIES – Procedure codes and ICD codes
LABORATORY RESULTS – LOINC, Date/Time, Reference Range, Value, UOM
Standard dictionaries: ICD9/10, SNOMED-CT, NDC, LOINC, NPI
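The components listed above could be represented in code roughly as follows. This is a minimal sketch, not a standard schema: the class names `LabResult` and `PatientRecord`, their fields, and the sample reference range are assumptions made for illustration, and only a few of the listed components are modeled.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class LabResult:
    """One laboratory result, keyed by a LOINC code (hypothetical structure)."""
    loinc_code: str
    value: float
    unit: str
    collected_on: date
    reference_low: Optional[float] = None
    reference_high: Optional[float] = None

    def is_out_of_range(self) -> bool:
        """True when the value falls outside the supplied reference range."""
        below = self.reference_low is not None and self.value < self.reference_low
        above = self.reference_high is not None and self.value > self.reference_high
        return below or above


@dataclass
class PatientRecord:
    """A simplified patient record holding a subset of the EMR components."""
    patient_id: str
    birth_year: int
    gender: str
    diagnoses: List[str] = field(default_factory=list)    # ICD-10 codes
    medications: List[str] = field(default_factory=list)  # NDC codes
    labs: List[LabResult] = field(default_factory=list)

    def abnormal_labs(self) -> List[LabResult]:
        """Return the lab results that fall outside their reference range."""
        return [lab for lab in self.labs if lab.is_out_of_range()]


# Illustrative record: an HbA1c result above an assumed reference range.
record = PatientRecord(
    patient_id="P001",
    birth_year=1968,
    gender="F",
    diagnoses=["E11.9"],
    labs=[LabResult("4548-4", 7.2, "%", date(2020, 1, 15), 4.0, 5.6)],
)
abnormal = record.abnormal_labs()
print([lab.loinc_code for lab in abnormal])
```

Keeping coded fields (ICD, NDC, LOINC) as plain strings mirrors how they arrive from source systems; validation against the standard dictionaries would be a separate step.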
5. DIABETES – THE MAGNITUDE OF THE PROBLEM
Diabetes is the world's eighth biggest killer, accounting for some 1.5 million deaths each year. A major new World Health Organization report has now revealed that the number of cases around the world has nearly quadrupled to 422 million in 2014 from 108 million in 1980. The Eastern-Mediterranean region had the biggest increase in cases during that time frame. Diabetes now affects one in 11 adults, with high blood sugar levels linked to 3.8 million deaths every year.
REFERENCE: https://www.statista.com/chart/4617/the-unrelenting-global-march-of-diabetes/
6. WHAT HAPPENS IN DIABETES MELLITUS
• https://youtu.be/qn2dhw0NJxo
Type 1 diabetes (T1DM)
In people with type 1 diabetes, the body does not make insulin. The immune system attacks and destroys the cells in the pancreas that make insulin. Type 1 diabetes is usually diagnosed in children and young adults, although it can appear at any age. People with type 1 diabetes need to take insulin every day to stay alive.
Type 2 diabetes (T2DM)
In people with type 2 diabetes, the body does not make or use insulin well. Type 2 diabetes can develop at any age, even during childhood. However, this type of diabetes occurs most often in middle-aged and older people. Type 2 is the most common type of diabetes.
COURTESY: NIDDK
https://www.niddk.nih.gov/health-information/diabetes/overview/what-is-diabetes
IMAGE COURTESY: KHAN ACADEMY
7. HOW MACHINE LEARNING CAN HELP IN DIABETES
• Predicting risk of heart failure for diabetes patients with help from machine learning
• Identification of Type 2 Diabetes Risk Factors Using Phenotypes Consisting of Anthropometry and Triglycerides based on Machine Learning
• Use of a Machine Learning Algorithm Improves Prediction of Progression to Diabetes
• Predicting Future Glucose Fluctuations Using Machine Learning and Wearable Sensor Data
• Predicting Diabetes Mellitus With Machine Learning Techniques
• Machine-learning to stratify diabetic patients using novel cardiac biomarkers and integrative genomics
• Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
• Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records
• Data-Driven Blood Glucose Pattern Classification and Anomalies Detection: Machine-Learning Applications in Type 1 Diabetes
8. APPROACH FOR DM READMISSION PREDICTIVE MODEL
• DMT2 risk prediction using clinical data and statistical and machine learning algorithms/models
Predictor Variables (total 44 variables)
Demographic – Age, Gender, Ethnicity
Diagnosis – Type of condition (DM T1/T2) diagnosis, # of comorbidities, Position (primary, secondary, etc.) of diagnosis
Encounter – IP, OP, AE visits
Medications – Dosage, frequency, route
Lab results – Test names, dates, UOM, value, Normal/abnormal result
Admission – Length of stay, Admission method (elective, non-elective), Discharge destination
Procedure – Count of procedures, Cost of procedures
Response Variable – Readmission within 30 days
INPUT MODEL OUTPUT
1 Data split into time windows
Windows: Observation window, Performance window, Validation window
Durations shown: 4 years, 1 year
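A minimal sketch of how encounters might be assigned to the three time windows, assuming each encounter carries a date. The boundary dates, the reading of "4 years" as the observation window and "1 year" as the performance window, the 1-year validation window, and the names assign_window and split_by_window are all illustrative assumptions; the slides do not give them.

```python
from datetime import date

# Hypothetical window boundaries (not from the slides):
# 4-year observation, 1-year performance, 1-year validation.
OBSERVATION_START = date(2010, 1, 1)
PERFORMANCE_START = date(2014, 1, 1)
VALIDATION_START = date(2015, 1, 1)
VALIDATION_END = date(2016, 1, 1)


def assign_window(event_date):
    """Return the name of the window an event date falls into, or None."""
    if OBSERVATION_START <= event_date < PERFORMANCE_START:
        return "observation"
    if PERFORMANCE_START <= event_date < VALIDATION_START:
        return "performance"
    if VALIDATION_START <= event_date < VALIDATION_END:
        return "validation"
    return None


def split_by_window(encounters):
    """Group encounter records (dicts with a 'date' key) by time window."""
    groups = {"observation": [], "performance": [], "validation": []}
    for enc in encounters:
        window = assign_window(enc["date"])
        if window is not None:  # events outside all windows are dropped
            groups[window].append(enc)
    return groups


# Toy encounters, invented for illustration only.
encounters = [
    {"patient_id": 1, "date": date(2011, 6, 3)},
    {"patient_id": 1, "date": date(2014, 2, 10)},
    {"patient_id": 2, "date": date(2015, 7, 21)},
    {"patient_id": 3, "date": date(2009, 12, 31)},  # before all windows
]
groups = split_by_window(encounters)
print({name: len(rows) for name, rows in groups.items()})
# {'observation': 1, 'performance': 1, 'validation': 1}
```

Splitting by date rather than at random keeps the validation data strictly later than the training data, which is what makes an out-of-time check possible.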
2 Models built using the following algorithms (data from observation and performance windows)
Logistic regression model (LOG)
Decision tree model (DT)
Random forest model (RF)
Model Ensembles
3 In-time validation (within performance window)
Model   GINI    AUC     KS      WORST DECILE CAPTURE
LOG     48.6%   74.3%   34.9%   29.4%
DT      37.3%   68.7%   38.5%   28.2%
RF      53.5%   76.7%   39.8%   33.7%
4 Out-of-time validation (in validation window)
All three models provided accuracy of ~80% in the out-of-time validation scenario
The RF model, with ~76% AUC, indicates a reasonably good fit
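The GINI, AUC and KS figures reported for the models can be computed from true labels and predicted scores. The sketch below is a plain-Python illustration on invented toy data, not the authors' code; the function names are hypothetical. It uses the standard relation Gini = 2 × AUC − 1, which the reported LOG values also satisfy (2 × 0.743 − 1 = 0.486). Worst decile capture is not covered here.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def gini(y_true, y_score):
    """Gini coefficient derived from AUC."""
    return 2.0 * roc_auc(y_true, y_score) - 1.0


def ks_statistic(y_true, y_score):
    """Largest gap between the cumulative score distributions of the two classes."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    best = 0.0
    for t in sorted(set(y_score)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Toy labels (1 = readmitted) and model scores, invented for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.20, 0.90]

print(roc_auc(y_true, y_score))       # 0.9375
print(gini(y_true, y_score))          # 0.875
print(ks_statistic(y_true, y_score))  # 0.75
```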
5 Significant variables (major drivers of readmission)
SEVERITY OF DM
• # of DM spells in past 1 year
• ED LOS in past 1 year
• # of procedures undergone
• # of OPD visits in past 1 year
• # of ED visits in past 1 year
• # of IP visits in past 1 year
• # of comorbidities
• Distance from hospital
• DM LOS in past 1 year
• Time since last ED visit
• Total ED cost in past 1 year
• Age of patient
6 Patient category based on risk score: High, Low
9. RISK PREDICTION MODEL: DESIGN, EVALUATION
Missing imputation – Mean/Median, Regression, KNN
Feature Selection – Feature importance, RFE, WoE and IV
Model Build – Tree based (DT, RF, GBT), Others (SVM, NN, NB)
Model Evaluation – K-fold cross validation, ROC curve
Patient cohorts are created based on ICD 9/10 codes for a defined chronic disease (e.g. DMT2) and also on the time of diagnosis, to separate already diagnosed patients from those who will potentially develop the disease.
Prospective Cohort - Scoring Dataset
Feature selection mechanisms help to focus on the most important variables that influence the outcome variable. The methods mentioned above have been used.
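As one illustration of the feature selection methods named above, Weight of Evidence (WoE) and Information Value (IV) can be computed for a binned feature as sketched here. This assumes the common convention WoE = ln(% non-events / % events) and adds a small smoothing constant to avoid log(0); the function name woe_iv, the smoothing value and the toy data are assumptions, not taken from the slides.

```python
import math
from collections import defaultdict


def woe_iv(bins, outcomes, smoothing=0.5):
    """Return ({bin: WoE}, IV) for a binned feature and a binary outcome (1 = event)."""
    events = defaultdict(float)
    non_events = defaultdict(float)
    for b, y in zip(bins, outcomes):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1
    labels = sorted(set(bins))
    # Additive smoothing avoids log(0) for bins with no events or no non-events.
    total_events = sum(events[b] + smoothing for b in labels)
    total_non_events = sum(non_events[b] + smoothing for b in labels)
    woe, iv = {}, 0.0
    for b in labels:
        pct_event = (events[b] + smoothing) / total_events
        pct_non_event = (non_events[b] + smoothing) / total_non_events
        woe[b] = math.log(pct_non_event / pct_event)
        iv += (pct_non_event - pct_event) * woe[b]
    return woe, iv


# Toy feature: number of prior ED visits, binned; outcome 1 = readmitted.
bins = ["0"] * 6 + ["1-2"] * 5 + ["3+"] * 5
outcomes = [1, 0, 0, 0, 0, 0] + [1, 1, 0, 0, 0] + [1, 1, 1, 1, 0]

woe, iv = woe_iv(bins, outcomes)
for label in sorted(woe):
    print(label, round(woe[label], 3))
print("IV:", round(iv, 3))
```

Under this convention, a negative WoE marks a bin where events are over-represented, and a higher IV suggests a more predictive feature.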
EMR data has many dimensions, which also means a lot of values are missing. Imputation methods help keep most of the features usable.
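A minimal sketch of mean/median imputation for a single numeric feature, assuming missing values are represented as None. The function name impute and the HbA1c values are hypothetical and only illustrate the idea; regression and KNN imputation are not shown.

```python
from statistics import mean, median


def impute(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed) if strategy == "median" else mean(observed)
    return [fill if v is None else v for v in values]


# Toy HbA1c readings with two missing results, invented for illustration.
hba1c = [6.1, None, 7.4, 9.0, None, 6.5]
filled_median = impute(hba1c, "median")
filled_mean = impute(hba1c, "mean")
print(filled_median)
print(filled_mean)
```

The median is often preferred for skewed lab values because it is less sensitive to extreme readings than the mean.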
The basic task is classification, which is done by computing the probability of the outcome at each patient level and then applying thresholds.
Multiple models were created and then validated on accuracy metrics to select the best model. Cross validation and area under the ROC curve were utilized.
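K-fold cross validation can be sketched in plain Python as follows. The toy threshold "model", the data, and the names k_fold_indices, cross_validate, fit_threshold and accuracy are all illustrative assumptions; in practice a library implementation with shuffling or stratification would normally be used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(fit, score, X, y, k=5):
    """Fit on k-1 folds, score on the held-out fold, and return the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return scores


def fit_threshold(X_train, y_train):
    """Toy model: use the mean of the training feature as a decision threshold."""
    return sum(X_train) / len(X_train)


def accuracy(threshold, X_test, y_test):
    """Share of test cases where (x > threshold) matches the true label."""
    correct = sum((x > threshold) == bool(label) for x, label in zip(X_test, y_test))
    return correct / len(X_test)


# Toy single-feature data, invented for illustration.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = cross_validate(fit_threshold, accuracy, X, y, k=5)
print(scores, sum(scores) / len(scores))
```

Averaging the per-fold scores gives a more stable estimate of performance than a single train/test split.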
Scoring was done on the prospective cohort to group patients into high risk, medium risk and low risk. The high risk group was to be targeted for interventions.
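Grouping scored patients into risk bands amounts to applying cut-offs to the predicted probabilities. The sketch below assumes two cut-offs, 0.3 and 0.6, which are purely illustrative; the slides do not state the thresholds used, and the function name and patient scores are hypothetical.

```python
def risk_band(probability, low_cutoff=0.3, high_cutoff=0.6):
    """Map a predicted probability to a risk band using hypothetical cut-offs."""
    if probability >= high_cutoff:
        return "high"
    if probability >= low_cutoff:
        return "medium"
    return "low"


# Toy predicted probabilities per patient, invented for illustration.
scores_by_patient = {"P001": 0.82, "P002": 0.45, "P003": 0.10, "P004": 0.60}
bands = {pid: risk_band(p) for pid, p in scores_by_patient.items()}
high_risk = [pid for pid, band in bands.items() if band == "high"]

print(bands)
print(high_risk)  # patients to be targeted for interventions
```

In practice the cut-offs would be chosen from the validation results, for example to match the capacity of the intervention programme.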
10. PRACTICAL USE CASE AND CODE DEMO
USE CASE
DATASET
• Risk Prediction for Diabetes
• Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of Clinical Database Patient Records
UCI MACHINE LEARNING REPOSITORY - Description
100,000 T2DM patients from 130 hospitals; CERNER HEALTH FACTS
OUTCOME
• How likely is a patient to be diagnosed with DM in near future?
• How likely is a T2DM patient to come back to the hospital within 30 days of discharge, or after 30 days?
METHODS
Multiple ML models generated and compared
Individual Classifiers: DT, LOGREG, SVC
Ensemble Classifiers: RF, GBC
GitHub Link