Health professionals and researchers have access to plenty of healthcare data. However, the implementation of artificial intelligence (AI) in healthcare remains very limited, primarily because many healthcare professionals are unfamiliar with the technology. The purpose of this article is to introduce AI to healthcare professionals and show how it applies to different types of healthcare data.
IT (information technology) professionals such as data scientists, AI developers, and data engineers also face challenges in the healthcare domain; for example, finding the right problem,1 a lack of data available for training AI models, and various issues with validating those models. This article highlights potential areas of healthcare where IT professionals can collaborate with healthcare experts to build teams of doctors, scientists, and developers, and translate ideas into healthcare products and services.
Intel provides educational software and hardware support to health professionals, data scientists, and AI developers. Based on the dataset type, we highlight a few use cases in the healthcare domain where AI was applied using various medical datasets.
AI is a set of techniques that enables computers to mimic human behavior. AI in healthcare uses algorithms and software to analyze complex medical data and find relationships between patient outcomes and prevention/treatment techniques.2 Machine learning (ML) is a subset of AI; it uses various statistical methods and algorithms that enable a machine to improve with experience. Deep learning (DL) is a subset of ML.3 It takes machine learning to the next level with multilayer neural network architectures, identifying patterns and performing other complex tasks much as the human brain does. DL has been applied in many fields such as computer vision, speech recognition, natural language processing (NLP), object detection, and audio recognition.4 Deep neural networks (DNNs) and recurrent neural networks (RNNs), examples of deep learning architectures, are used to improve drug discovery and disease diagnosis.5
Figure 1. Relationship of artificial intelligence, machine learning, and deep learning
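To make the "multilayer neural network" idea concrete, here is a minimal sketch of a forward pass through a stack of fully connected layers. The layer sizes, random weights, and input values are illustrative only, not from any trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Nonlinearity applied between layers; without it, stacked layers
    # would collapse into a single linear transformation.
    return np.maximum(0.0, x)

def forward(x, weights):
    """Pass input x through a stack of fully connected layers."""
    for w in weights[:-1]:
        x = relu(x @ w)
    return x @ weights[-1]          # linear output layer

layers = [4, 8, 8, 2]               # input, two hidden layers, output
weights = [rng.normal(size=(m, n)) for m, n in zip(layers, layers[1:])]

patient_features = np.array([0.2, 1.3, -0.5, 0.7])   # hypothetical inputs
scores = forward(patient_features, weights)
print(scores.shape)                 # one score per output class
```

Training would adjust the weight matrices by backpropagation; here they stay random, since the point is only the layered structure.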
According to Frost & Sullivan (a growth partnership company), the AI market in healthcare may reach USD 6.6 billion by 2021, a 40 percent growth rate. AI has the potential to reduce the cost of treatment by up to 50 percent.6 AI applications in healthcare may generate USD 150 billion in annual savings by 2026, according to Accenture analysis. AI-based smart workforces, cultures, and solutions are consistently evolving to support the healthcare industry in multiple ways, such as:7
Hospitals, clinics, and medical and research institutes generate a large volume of data on a daily basis, including lab reports, imaging data, pathology reports, diagnostic reports, and drug information. Such data is expected to increase greatly in the next few years as people expand their use of smartphones, tablets, the IoT (Internet of Things), and fitness gadgets to generate information.8 Digital data is expected to reach 44 zettabytes by 2020, doubling every year.9 The rapid expansion of healthcare data is one of the greatest challenges for clinicians and physicians. Current literature suggests that the big data ecosystem and AI are solutions for processing this massive data explosion while meeting the social, financial, and technological demands of healthcare. Analyzing such large, complicated data is often difficult and requires a high level of data-analysis skill. The most challenging part, moreover, is interpreting the results and making recommendations based on the outcome and medical experience, which requires many years of medical involvement, knowledge, and specialized skill sets.
In healthcare, data are generated, collected, and stored in multiple formats, including numbers, text, images, scans, audio, and video. To apply AI to a dataset, we first need to understand the nature of the data and the questions we want to answer from it. The data type helps us formulate the neural network, algorithm, and architecture for AI modeling. Here, we introduce a few AI-based cases as examples to demonstrate the application of AI in healthcare in general. Typically, an approach can be customized to the project and area of interest (that is, oncology, cardiology, pharmacology, internal medicine, primary care, urgent care, emergency, and radiology). Below is a list of AI applications, organized by dataset format, that are gaining momentum in the real world.
One of the most common ways data is generated in healthcare is as images, such as scans (PET scan image credit: Susan Landau and William Jagust, UC Berkeley)10, tissue sections11, drawings12, and organ images13 (Figure 2A). In this scenario, specialists look for particular features in an image. A pathologist, for example, collects such images under the microscope from tissue sections (fat, muscle, bone, brain, liver biopsy, and so on). Recently, Kaggle organized the Intel and MobileODT Cervical Cancer Screening Competition to improve the precision and accuracy of cervical cancer screening using a large image dataset (training, testing, and additional datasets).14 Participants used different deep learning models, such as the faster region-based convolutional neural network (R-CNN) detection framework with VGG16,15 supervised semantics-preserving deep hashing (SSDH) (Figure 2B), and U-Net for convolutional networks.16 Dr. Silva achieved 81 percent accuracy using Caffe* on GoogLeNet* for the validation test.16
Similarly, Xu et al. investigated datasets of over 7,000 images of single red blood cells (RBCs) from eight patients with sickle cell disease, selecting a DNN classifier to distinguish the different RBC types.17 Gulshan et al. applied a deep convolutional neural network (DCNN) to more than 10,000 retinal images collected from 874 patients to detect moderate and worse referable diabetic retinopathy with about 90 percent sensitivity and specificity.18
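The core operation these image models share is the 2-D convolution: sliding a small kernel across the pixels to produce a feature map. The sketch below implements that operation directly in NumPy on a toy 8x8 "scan"; the hand-picked edge-detection kernel is illustrative, since a real CNN learns its kernels from labeled images.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid (no-padding) 2-D convolution of a grayscale image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((8, 8))
image[:, 4:] = 1.0                      # toy "scan": bright right half
edge_kernel = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]])    # responds to vertical edges

feature_map = conv2d(image, edge_kernel)
print(feature_map.shape)                # (6, 6)
```

The feature map is nonzero only around the dark-to-bright boundary, which is exactly the kind of localized pattern a trained CNN layer picks out of a pathology slide or retinal photo.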
Figure 2. A) Various types of healthcare image data. B) Supervised semantics-preserving deep hashing (SSDH), a deep learning model, used in the Intel and MobileODT Cervical Cancer Screening Competition for image classification. Source: 10-13,16.
Positron emission tomography (PET), computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound images (Figure 2A) are another source of healthcare data, where images of internal organs (such as the brain) and tumors are collected noninvasively. Deep learning models can be used to measure tumor growth over time in cancer patients on medication. Jaeger et al. applied a convolutional neural network (CNN) architecture to diffusion-weighted MRI. Based on an estimation of the properties of the tumor tissue, this architecture reduced false-positive findings and thereby decreased the number of unnecessary invasive biopsies. The researchers noted that deep learning reduced motion and vision error and thus provided more stable results than manual segmentation.19 A study conducted in China showed that deep learning achieved 93 percent accuracy in distinguishing malignant from benign cancer on elastograms of ultrasound shear-wave elastography from 200 patients.20,21
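Once a model (or a radiologist) has segmented a tumor into a binary mask on each scan, tracking growth over time reduces to comparing mask areas. The masks and pixel spacing below are invented for illustration; clinical pipelines would use real segmentations and the spacing recorded in the scan metadata.

```python
import numpy as np

def tumor_area_mm2(mask, pixel_spacing_mm=0.5):
    """Area of a binary segmentation mask in square millimetres."""
    return mask.sum() * pixel_spacing_mm ** 2

baseline = np.zeros((64, 64), dtype=bool)
baseline[20:30, 20:30] = True            # 100-pixel "tumor" at baseline

followup = np.zeros((64, 64), dtype=bool)
followup[18:32, 18:32] = True            # grown to 196 pixels at follow-up

a0 = tumor_area_mm2(baseline)            # 100 * 0.25 = 25.0 mm^2
a1 = tumor_area_mm2(followup)            # 196 * 0.25 = 49.0 mm^2
growth_pct = 100.0 * (a1 - a0) / a0
print(round(growth_pct, 1))              # 96.0
```

The hard part in practice is producing the masks consistently, which is where the segmentation networks cited above come in.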
Figure 3. Example of numerical data
Healthcare industries collect a lot of patient- and research-related information, such as age, height, weight, blood profile, lipid profile, blood sugar, blood pressure, and heart rate. Similarly, gene expression data (for example, fold change) and metabolic information (for example, levels of metabolites) are also expressed as numbers.
The literature shows several cases where neural networks were successfully applied in healthcare. For instance, Danaee and Ghaeini from Oregon State University (2017) used a deep architecture, the stacked denoising autoencoder (SDAE) model, to extract meaningful features from gene expression data of 1,097 breast cancer and 113 healthy samples. This model enables the classification of breast cancer cells and the identification of genes useful for cancer prediction (as biomarkers) or as potential therapeutic targets.22 Kaggle shared the breast cancer dataset from the University of Wisconsin, containing information on the radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, and fractal dimension of the cancer cell nucleus. In the Kaggle competition, participants successfully built DNN classifiers to predict breast cancer type (malignant or benign).23
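The shape of such a classifier is easy to sketch. Below, synthetic two-feature data stands in for the Wisconsin measurements, and a single logistic layer trained by gradient descent stands in for the deeper networks the participants built; the training loop has the same structure either way.

```python
import numpy as np

rng = np.random.default_rng(42)

# Two synthetic classes, separated along invented "radius"/"area" axes.
benign    = rng.normal(loc=[10.0, 15.0], scale=1.0, size=(100, 2))
malignant = rng.normal(loc=[18.0, 25.0], scale=1.0, size=(100, 2))
X = np.vstack([benign, malignant])
y = np.array([0] * 100 + [1] * 100)

# Standardize features, then fit by gradient descent on the log loss.
X = (X - X.mean(axis=0)) / X.std(axis=0)
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

accuracy = np.mean((p > 0.5) == y)
print(accuracy)   # well-separated classes, so near 1.0
```

Swapping the single layer for a multilayer network, and the synthetic points for the real nucleus measurements, gives the DNN approach described above.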
Figure 4. Example of textual data
Plenty of medical information is recorded as text; for instance, clinical data (cough, vomiting, drowsiness, and diagnoses), social, economic, and behavioral data (such as poor, rich, depressed, happy), social media reviews (Twitter*, Facebook*, Telegram*, and so on), and drug history. NLP, often implemented with neural networks, translates free text into standardized data. It enhances the completeness and accuracy of electronic health records (EHRs), and NLP algorithms can extract risk factors from notes available in the EHR.
For example, NLP was applied to 21 million medical records and identified 8,500 patients who were at risk of developing congestive heart failure, with 85 percent accuracy.24 The Department of Veterans Affairs used NLP techniques to review more than two billion EHR documents for indications of post-traumatic stress disorder (PTSD), depression, and potential self-harm in veteran patients.25 Similarly, NLP was used to identify psychosis with 100 percent accuracy in schizophrenic patients based on speech patterns.26 IBM Watson* analyzed 140,000 academic articles, far more than any human could read, understand, or remember, and suggested recommendations about a course of therapy for cancer patients.24
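As a deliberately simplified, rule-based sketch of what such pipelines do at scale, the snippet below pulls structured risk factors out of a free-text clinical note. The vocabulary and the note are invented; production systems use trained language models rather than keyword lists, but the input (free text) and output (standardized codes) are the same.

```python
import re

# Hypothetical mapping from surface phrases to standardized risk factors.
RISK_TERMS = {
    "shortness of breath": "dyspnea",
    "swollen ankles": "peripheral edema",
    "smoker": "tobacco use",
}

def extract_risk_factors(note):
    """Return the sorted set of standardized terms found in a note."""
    note = note.lower()
    return sorted({code for term, code in RISK_TERMS.items()
                   if re.search(re.escape(term), note)})

note = ("Patient reports shortness of breath on exertion. "
        "Long-time smoker, 20 pack-years.")
print(extract_risk_factors(note))   # ['dyspnea', 'tobacco use']
```

Keyword matching misses negation ("denies shortness of breath") and paraphrase, which is precisely why the neural NLP systems cited above outperform rules.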
Figure 5. Examples of electrogram data. Source:27,31
Figure 6. Architecture of deep learning with convolutional neural network model useful in classification of EEG data. (Source: 28-29)
Electrocardiogram (ECG)27, electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), and sleep tests are some examples of graphical healthcare data. Electrography is the process of recording the electrical activity of a target organ (such as the heart, brain, or muscle) over a period of time using electrodes placed on the skin.
Schirrmeister et al. from the University of Freiburg designed and trained deep ConvNets (deep learning with convolutional networks) to decode raw EEG data, which is useful for EEG-based brain mapping.28,29 Paurbabaee et al. from Concordia University, Canada used a large volume of raw ECG time-series data to build a DCNN model. Interestingly, this model learned key features of paroxysmal atrial fibrillation (PAF), a life-threatening heart disease, and was thereby useful for PAF patient screening. This method can be a good alternative to traditional, ad hoc, and time-consuming handcrafted features.30 Sleep stage classification is an important preliminary exam for sleep disorders. Using 61 polysomnography (PSG) time series, Chambon et al. built a deep learning model for sleep stage classification. The model performed better than traditional methods, with little run time and computational cost.31
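A common preprocessing step behind these models is worth making explicit: an electrogram is a sampled 1-D time series, which is cut into fixed-length windows (epochs) before a network classifies each one. The synthetic "EEG", sampling rate, and 30-second epoch length below are illustrative choices.

```python
import numpy as np

fs = 100                                   # sampling rate in Hz (assumed)
t = np.arange(0, 120, 1 / fs)              # two minutes of signal
rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.normal(size=t.size)

def epochs(x, fs, seconds=30):
    """Split a 1-D signal into non-overlapping fixed-length windows."""
    n = fs * seconds
    usable = (len(x) // n) * n             # drop any incomplete tail
    return x[:usable].reshape(-1, n)

windows = epochs(signal, fs)
print(windows.shape)                       # (4, 3000): four 30 s epochs

# A simple per-epoch feature (signal power) of the kind a classifier
# could consume; deep models instead learn features from the raw epochs.
power = np.mean(windows ** 2, axis=1)
```

Sleep staging assigns one label per epoch, so the (epochs x samples) array above is exactly the input shape such classifiers expect.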
Figure 7. Example of audio data.
Sound event detection (SED) deals with detecting the onset and offset times of each sound event in an audio recording and associating a textual descriptor with each. SED has recently drawn great interest in the healthcare domain for health monitoring. Cakir et al. combined CNNs and RNNs into a convolutional recurrent neural network (CRNN) and applied it to a polyphonic sound event detection task, observing a considerable improvement with the CRNN model.32
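To illustrate what "onset and offset times" means concretely, here is a toy detector that thresholds short-time energy to localize a single event in a synthetic recording. Real SED systems, such as the CRNN cited above, learn this from data; the sampling rate, frame length, and threshold here are arbitrary assumptions.

```python
import numpy as np

fs = 1000                                   # samples per second (assumed)
audio = np.zeros(3 * fs)                    # three seconds of silence...
audio[fs:2 * fs] = np.sin(2 * np.pi * 40 * np.arange(fs) / fs)  # ...with a tone in second two

frame = 50                                  # 50 ms analysis frames
energy = np.array([np.mean(audio[i:i + frame] ** 2)
                   for i in range(0, len(audio) - frame + 1, frame)])
active = energy > 0.1                       # frames where the event is on

onset = np.argmax(active) * frame / fs              # first active frame
offset = (len(active) - np.argmax(active[::-1])) * frame / fs  # last one
print(onset, offset)                        # 1.0 2.0
```

Associating a label ("cough", "alarm", "fall") with each detected segment is the part that requires a trained classifier.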
Videos are sequences of images; in some cases they can be considered time series, and in very particular cases, dynamical systems. Deep learning techniques help researchers in both the computer vision and multimedia communities significantly boost the performance of video analysis and open new research directions for analyzing video content. Microsoft started a research project called InnerEye* that uses machine learning to build innovative tools for the automatic, quantitative analysis of three-dimensional radiological images. Project InnerEye employs algorithms such as deep decision forests as well as CNNs for the automatic, voxel-wise analysis of medical images.33 Khorrami et al. built a model on videos from the Audio/Visual Emotion Challenge (AVEC 2015) using both RNNs and CNNs, and performed emotion recognition on the video data.34
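The "sequence of images" view can be shown directly: stacking frames gives a 3-D array, and even simple differencing of consecutive frames exposes temporal change that a per-image model would miss. The moving bright square below is a toy stand-in for real footage.

```python
import numpy as np

frames = np.zeros((5, 32, 32))              # 5 frames of 32x32 "video"
for k in range(5):
    # A 4x4 bright square that shifts 5 pixels right each frame.
    frames[k, 10:14, 4 + 5 * k : 8 + 5 * k] = 1.0

motion = np.abs(np.diff(frames, axis=0))    # change between consecutive frames
print(motion.shape)                         # (4, 32, 32)

# Total change per step is constant here, since the square moves steadily:
# 16 pixels appear and 16 disappear at each step.
print(motion.sum(axis=(1, 2)))
```

Video networks such as the RNN+CNN models cited above replace this hand-built difference with learned spatiotemporal features, but they consume the same frames-by-height-by-width array.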
Figure 8. Molecular structure of 4CDG (Source: rcsb.org)
Figure 8 shows a typical example of the molecular structure of a drug molecule. Generally, the design of a new molecule draws on historical datasets of old molecules. In quantitative structure-activity relationship (QSAR) analysis, scientists try to find known and novel patterns between structures and activity. At the Merck Research Laboratory, Ma et al. used a dataset of thousands of compounds (about 5,000) and built a model based on a DNN architecture.35 In another QSAR study, Dahl et al. built neural network models on 19 datasets of 2,000 to 14,000 compounds each to predict the activity of new compounds.36 Aliper and colleagues built a deep neural network-support vector machine (DNN-SVM) model, trained on a large transcriptional response dataset, that classified various drugs into therapeutic categories.37 Tavanaei developed a convolutional neural network model to classify tumor suppressor genes and proto-oncogenes with 82.57 percent accuracy; the model was trained on protein tertiary structures obtained from the Protein Data Bank.38 AtomNet* is the first structure-based DCNN. It incorporates structural target information to predict the bioactivity of small molecules, and it successfully predicted new active molecules for targets with no previously known modulators.39
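A common representation behind QSAR models is worth sketching: each molecule is encoded as a binary "fingerprint" of substructure presence/absence, and candidates are compared to known actives with Tanimoto similarity. The bit patterns below are invented; real pipelines derive fingerprints with cheminformatics toolkits such as RDKit before feeding them to a neural network.

```python
import numpy as np

def tanimoto(a, b):
    """Jaccard similarity between two binary fingerprints."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    inter = np.sum(a & b)          # substructures present in both
    union = np.sum(a | b)          # substructures present in either
    return inter / union if union else 0.0

# Hypothetical 8-bit fingerprints (real ones have thousands of bits).
known_active = [1, 1, 0, 1, 0, 0, 1, 0]
candidate_a  = [1, 1, 0, 1, 0, 1, 1, 0]    # shares most substructures
candidate_b  = [0, 0, 1, 0, 1, 1, 0, 1]    # shares none

print(round(tanimoto(known_active, candidate_a), 2))   # 0.8
print(tanimoto(known_active, candidate_b))             # 0.0
```

Similarity search of this kind is the classical baseline; the DNN-based QSAR models cited above learn activity directly from such fingerprint vectors instead of relying on a fixed similarity measure.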
Here are a few practical examples where AI developers, startups, and institutes are building and testing AI models:
Deep learning has great potential to help medical and paramedical practitioners by:
The examination of thousands of images is complex, time consuming, and labor intensive. How can AI help?
A team from Harvard Medical School's Beth Israel Deaconess Medical Center observed a 2.9 percent error rate with an AI model and a 3.5 percent error rate with pathologists for breast cancer diagnosis. Interestingly, pairing deep learning with a pathologist produced a 0.5 percent error rate, an 85 percent drop.40 Litjens et al. suggest that deep learning holds great promise for improving the efficacy of prostate cancer diagnosis and breast cancer staging.41,42
Intel provides educational software and hardware support to health professionals, data scientists, and AI developers, and makes free AI training and tools available through the Intel® AI Developer Program.
Intel recently published a series of AI hands-on tutorials, walking through the process of AI project development, step-by-step. Here you will learn:
Intel is committed to providing solutions for your healthcare project. Please read the article on the Intel AI Developer Program to learn more about solutions using Intel® architecture (Intel® Processors for Deep Learning Training). In the next article, we explore examples of healthcare datasets and show how to apply deep learning to them. Intel is committed to helping you achieve your project goals.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804