July 15, 2024

Health Benefit

Healthy is Rich, Today's Best Investment

Breaking Down the 4 Types of Healthcare Big Data Analytics


Big data analytics has become a hot topic as healthcare systems have transitioned to EHRs and begun pursuing digital transformation.

Patients generate massive amounts of data throughout their care journeys that could be used to improve healthcare delivery and outcomes, and big data analytics can help make this information accessible and actionable.

According to IBM, big data refers to data or datasets that are too massive and diverse to be efficiently processed by traditional data analysis software. Data sources are becoming more complex because of the advent of artificial intelligence (AI), the Internet of Things (IoT), and mobile devices.

Big data can be structured, semi-structured, or unstructured. These data are typically characterized by five key qualities: volume, or the quantity of data; variety, or the types of data and their sources; velocity, or how fast the data is flowing; veracity, or the inconsistencies and uncertainties within the data; and value, or how useful the data is.

Microsoft defines big data analytics as “the methods, tools, and applications used to collect, process, and derive insights from varied, high-volume, high-velocity data sets,” which come into play when traditional data analysis tools cannot handle the data’s complexity.

Big data analytics can be categorized into four types: descriptive, diagnostic, predictive, and prescriptive. Each of these can be applied across healthcare settings and use cases. But understanding each one and developing a comprehensive big data analytics strategy that includes them all is necessary before health systems can effectively analyze and gain insights from their data.

Here, HealthITAnalytics will break down the four types of big data analytics leveraged in healthcare, including each one’s purpose and potential applications.

Descriptive Analytics

Big data analytics can help support preventive health, find gaps in chronic disease management, improve population health management, and advance precision medicine. But to reap these benefits, healthcare organizations must make their data actionable.

The first step in that process is descriptive analytics, which experts from the Harvard Business School indicate is designed to answer the question, “What happened?”

Descriptive analytics provides stakeholders with a way to find and describe trends in raw data that show what has happened or is currently happening within a business, especially as it relates to key performance indicators (KPIs) or goals.

In healthcare, this type of analytics can help answer surface-level questions about a health system’s operations, revenue, and patients.

For example, a healthcare organization could use descriptive analytics to answer questions about how many clinical staff, such as nurses, there are per patient within each hospital department, whether the organization is meeting its financial goals, or how many patients have been hospitalized within a given week.

Descriptive analytics is the most straightforward and accessible of the four types, as all healthcare organizations generate basic patient, financial, and operational data. Because descriptive analytics leverages these raw data, health systems only need to pull the existing data together in a way that provides insights into the metric or goal that these data represent.

Doing so typically doesn’t require healthcare providers to have a deep understanding of advanced analytics methodologies, as they can sufficiently perform and present this type of analytics using percentages and averages. Pie charts, bar graphs, tables, and data dashboards are excellent for highlighting trends in descriptive analytics data, as they are easier to understand than other data visualization techniques.
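As a rough illustration of how little machinery this requires, the sketch below computes nurse-to-patient ratios and each department's percentage share of patients using only basic arithmetic. The department names, staffing figures, and function names are hypothetical and chosen purely for illustration.

```python
def nurses_per_patient(staffing):
    """Return the nurse-to-patient ratio for each department.

    `staffing` maps a department name to a (nurses, patients) tuple.
    Departments with no patients are skipped to avoid dividing by zero.
    """
    return {
        dept: round(nurses / patients, 2)
        for dept, (nurses, patients) in staffing.items()
        if patients > 0
    }


def share_of_total(counts):
    """Return each category's percentage share of the overall total."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# Hypothetical figures: (nurses on staff, current patients) per department.
staffing = {"ICU": (12, 24), "Emergency": (15, 60), "Cardiology": (9, 36)}

ratios = nurses_per_patient(staffing)
patient_share = share_of_total({dept: p for dept, (_, p) in staffing.items()})

for dept in staffing:
    print(f"{dept}: {ratios[dept]} nurses per patient, "
          f"{patient_share[dept]}% of patients")
```

Values like these are the kind that would typically feed a bar graph or dashboard tile.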

If one looks at the four types of healthcare big data analytics through the lens of tracking patient hospitalizations, a healthcare organization may decide to track how many of its patients are hospitalized weekly.

To use descriptive analytics, stakeholders would gather hospital admissions data for the time period they are interested in analyzing and define parameters for the analysis, such as deciding that each week starts on Sunday and ends on Saturday. This ensures that the data are measured consistently from week to week.

These data may then need to be cleaned and normalized, which entails taking the information from where it lives, such as EHRs, billing systems, and admissions records, and removing duplicates or errors. The process may also require removing outliers and unwanted data points.

After identifying and addressing gaps in the data — either by seeking the missing information from other data sources or excluding data points with missing information — the data can be structured, which often involves organizing and formatting the data to make it easier to analyze and visualize.

In the patient hospitalization example, this process would likely entail pulling all admissions data for a given time period, removing any records with duplicate entries, errors, or missing data, and structuring the data so that it could be formatted in terms of the number of individual admissions during each predefined week.
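The steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the record layout (`admission_id`, `admitted_on`) and the sample records are assumptions, and the sketch simply excludes incomplete records instead of backfilling them from other sources.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Sunday that begins the Sunday-to-Saturday week containing d."""
    # date.weekday() gives Monday=0 ... Sunday=6, so shift to make Sunday=0.
    days_since_sunday = (d.weekday() + 1) % 7
    return d - timedelta(days=days_since_sunday)


def weekly_admissions(records):
    """Count unique, complete admission records per Sunday-start week."""
    seen_ids = set()
    counts = Counter()
    for rec in records:
        admission_id = rec.get("admission_id")
        admitted_on = rec.get("admitted_on")
        # Exclude records with missing information.
        if admission_id is None or admitted_on is None:
            continue
        # Remove duplicates, e.g. one admission exported by two systems.
        if admission_id in seen_ids:
            continue
        seen_ids.add(admission_id)
        counts[week_start(admitted_on)] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample records.
records = [
    {"admission_id": "A1", "admitted_on": date(2024, 7, 7)},   # Sunday
    {"admission_id": "A2", "admitted_on": date(2024, 7, 13)},  # Saturday, same week
    {"admission_id": "A2", "admitted_on": date(2024, 7, 13)},  # duplicate entry
    {"admission_id": "A3", "admitted_on": None},               # missing date
    {"admission_id": "A4", "admitted_on": date(2024, 7, 14)},  # following Sunday
]

weekly = weekly_admissions(records)
for start, count in weekly.items():
    print(f"Week of {start}: {count} admissions")
```

Fixing the week boundary in one helper function keeps every downstream count aligned to the same Sunday-to-Saturday definition.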

Diagnostic Analytics

Once a healthcare organization has used descriptive analytics to uncover what is happening, one of the next questions that naturally follows is why that phenomenon is occurring. Diagnostic analytics seeks to provide the answer.

Investigating the causes behind an event is critical in healthcare, as ‘events’ in clinical settings can be life-threatening for patients.

Health systems must also apply diagnostic analytics to understand operational issues, as doing so at the enterprise scale allows health systems to move toward effectively meeting or surpassing their goals, including improving patient outcomes.

By digging deeper and supplementing the findings from an organization’s descriptive analytics, diagnostic analytics helps stakeholders flag potential issues negatively impacting KPIs.

The diagnostic analytics process usually involves identifying unexpected changes or anomalies in descriptive analytics data. From there, additional data related to those changes can be investigated and gathered, and statistical methods can help highlight data trends or relationships that may explain the change.
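One simple way to surface such unexpected changes is to flag weeks whose counts sit far from the average. The sketch below uses a z-score for this; the weekly counts and the threshold of 2.0 are illustrative assumptions, and a real analysis would likely need a longer history and a method that accounts for seasonality.

```python
from statistics import mean, stdev


def flag_anomalous_weeks(weekly_counts, threshold=2.0):
    """Return (week_index, count, z_score) for weeks far from the average.

    A week is flagged when its count lies at least `threshold` sample
    standard deviations above or below the mean of all weeks.
    """
    avg = mean(weekly_counts)
    spread = stdev(weekly_counts)
    if spread == 0:
        return []  # Every week is identical, so nothing stands out.
    flagged = []
    for index, count in enumerate(weekly_counts):
        z_score = (count - avg) / spread
        if abs(z_score) >= threshold:
            flagged.append((index, count, round(z_score, 2)))
    return flagged


# Hypothetical weekly hospitalization counts from the descriptive step.
weekly_counts = [120, 118, 125, 122, 119, 121, 180, 123]

anomalies = flag_anomalous_weeks(weekly_counts)
for index, count, z_score in anomalies:
    print(f"Week {index}: {count} admissions (z = {z_score})")
```

Flagged weeks are only a starting point: they show where to look, not why the change occurred.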

This type of analytics aims to compare data trends to reveal possible correlations and causal relationships between variables.

Going back to the example of the health system tracking its weekly hospitalizations, diagnostic analytics could help generate insights into the causes behind the hospitalizations. By comparing trends in the data, stakeholders may find that hospitalizations increase significantly during certain times of year or spike following certain weather events, like a snowstorm or heatwave.

Here, a deeper investigation into the data of the patients hospitalized during these spikes may reveal that car accidents resulting from winter weather or heat-related conditions linked to higher temperatures are to blame for the increase in hospitalizations.
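A relationship like the one between temperature and heat-related admissions can be checked with a correlation coefficient. The sketch below computes Pearson's r by hand; the temperature and admission figures are invented for illustration. A strong correlation suggests a link worth investigating, but it does not by itself establish that heat caused the admissions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Hypothetical data: weekly high temperature (F) and heat-related admissions.
weekly_high_temp_f = [78, 81, 85, 92, 99, 101, 88, 80]
heat_related_admissions = [3, 4, 5, 9, 14, 16, 6, 4]

r = pearson_r(weekly_high_temp_f, heat_related_admissions)
print(f"Correlation between temperature and admissions: r = {r:.2f}")
```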

On the other hand, diagnostic analytics is also useful if the data reveal that hospitalizations are holding steady or rising over long periods. In that case, examining the causes of hospital admissions may not yield as clear-cut an answer as investigating seasonal or weather-related spikes would.

Instead, the process of investigating reasons for admission may show multiple possible causes. In fact, the Centers for Disease Control and Prevention (CDC) indicate that some of the most common diagnoses associated with hospitalizations in the United States are septicemia, heart failure, osteoarthritis, pneumonia, and diabetes mellitus.

Diagnostic analytics may show that many of a health system’s hospitalizations are related to these or other conditions more prevalent within its patient population.

In this example, a hospital may find its admissions increasing predominantly because of heart failure and diabetes. This provides a jumping-off point for the organization to explore potential commonalities between patients in each group, such as reduced medication adherence or unmet social determinants of health (SDOH) needs.
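To see which conditions are driving an increase, an analyst might compare admission counts by primary diagnosis across two periods. The following sketch does this with simple counters; the diagnosis lists and counts are hypothetical.

```python
from collections import Counter


def diagnosis_growth(earlier, later):
    """Return (diagnosis, change) pairs, sorted with the largest increase first.

    `earlier` and `later` are lists of primary diagnoses, one entry per
    admission, for two periods of equal length.
    """
    before = Counter(earlier)
    after = Counter(later)
    changes = {dx: after[dx] - before[dx] for dx in set(before) | set(after)}
    # Sort by descending change, then alphabetically to keep ties stable.
    return sorted(changes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical admissions for two consecutive quarters.
earlier_quarter = ["heart failure"] * 20 + ["diabetes"] * 15 + ["pneumonia"] * 12
later_quarter = ["heart failure"] * 31 + ["diabetes"] * 24 + ["pneumonia"] * 11

growth = diagnosis_growth(earlier_quarter, later_quarter)
for diagnosis, change in growth:
    print(f"{diagnosis}: {change:+d} admissions")
```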

These findings can then be used to move into the next steps of the big data analytics process.

Predictive Analytics

If descriptive analytics shows what has happened and diagnostic analytics explains why it happened, predictive analytics allows health systems to understand what’s likely to happen in the future.

Like the first two types of analytics, predictive analytics also looks to historical data to generate insights. However, it takes the process one step further by using these data to extrapolate and infer future patterns.

In the era of value-based care, many health systems are interested in deploying predictive analytics to help identify which patients may be at risk of certain adverse outcomes in the future.

The health system tracking its hospitalizations may know that heart failure and diabetes are major contributors to admissions, but it can use predictive analytics to identify risk factors that these patients share that may have contributed to their hospitalization.

These risk factors may then be used to predict next week’s hospitalizations for these patient groups, which could help clinical staff adequately prepare for a potential influx.
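The idea can be illustrated with a deliberately simple, frequency-based sketch: estimate how often patients with a given risk profile were hospitalized in the past, then sum those rates across current patients to get an expected count for the coming week. The risk factors, history, and patient panel here are hypothetical, and real deployments would more likely rely on a trained statistical or ML model than on raw historical rates.

```python
from collections import defaultdict


def fit_rates(history):
    """Estimate the weekly hospitalization rate for each risk profile.

    `history` is an iterable of (profile, was_hospitalized) pairs, one per
    patient-week, where `profile` is a tuple of booleans such as
    (low_medication_adherence, unmet_sdoh_need).
    """
    totals = defaultdict(int)
    events = defaultdict(int)
    for profile, was_hospitalized in history:
        totals[profile] += 1
        events[profile] += int(was_hospitalized)
    return {profile: events[profile] / totals[profile] for profile in totals}


def expected_admissions(rates, current_patients, default_rate=0.0):
    """Sum per-patient rates to get the expected admissions for next week."""
    return sum(rates.get(profile, default_rate) for profile in current_patients)


# Hypothetical history of patient-weeks.
history = (
    [((True, True), True)] * 3 + [((True, True), False)] * 7
    + [((False, False), True)] * 1 + [((False, False), False)] * 19
)
rates = fit_rates(history)

# Hypothetical current patient panel.
current_patients = [(True, True)] * 20 + [(False, False)] * 100
forecast = expected_admissions(rates, current_patients)
print(f"Expected admissions next week: {forecast:.1f}")
```

Profiles that never appeared in the history fall back to `default_rate`, which is one reason small data repositories limit how far this kind of estimate can be trusted.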

Thus, the insights from predictive analytics can help stakeholders further home in on the causes of events that prevent healthcare organizations from achieving their KPIs, such as increased hospitalizations within patient populations.

However, some organizations may not be able to effectively employ predictive analytics. This type of analytics requires pulling large amounts of data from across data sources, which may make it less accessible for healthcare organizations with smaller data repositories or less data analytics expertise.

Predictive analytics is also often enabled through the use of machine learning (ML), which allows stakeholders to analyze vast amounts of data that cannot be effectively processed using traditional methods. But ML is not always accessible to smaller health systems with fewer resources.

Prescriptive Analytics

The first three types of healthcare data analytics are crucial to supporting healthcare organizations as they work to achieve their goals, but prescriptive analytics is where health systems will see the highest return on investment for their analytics efforts.

Prescriptive analytics takes the information generated by descriptive, diagnostic, and predictive analytics to assist stakeholders in deciding what should be done or changed to improve performance in a given metric. This type of analytics is the most advanced, requiring massive amounts of data and computing power, making it inaccessible to most healthcare organizations today.

However, health systems engaging in predictive analytics are already on their way to being able to pursue prescriptive analytics.

This type of analytics takes the risk factors or predictors produced through predictive analytics and analyzes them to determine what aspects of each factor could be altered to lower the associated risk. This process enables stakeholders to take all factors into account and gain insights into what actions can be taken to prevent multiple possible outcomes.

Prescriptive analytics represents an opportunity for healthcare organizations to make informed, data-driven decisions in the face of uncertainty and deploy strategies to improve KPIs.

To help health systems predict, visualize, and generate strategies to better navigate multiple possible future scenarios or adverse outcomes, prescriptive analytics techniques rely on advanced tools like AI, ML, and quantum computing.

Following the hospitalization tracking example’s big data analytics journey to its natural conclusion, the health system in question has already identified how many hospitalizations are happening each week, flagged that heart failure and diabetes are the primary causes of these admissions, and forecasted future hospitalizations based on risk factors shared by previously hospitalized patients in these populations.

In the prescriptive analytics step, the organization can receive suggestions for potential next steps to prevent these hospitalizations from happening.

If the health system has established that reduced medication adherence or unmet SDOH needs significantly increase the risk for hospitalization in its heart failure and diabetes patients, these suggestions might look like implementing a patient education program on the importance of medication adherence or partnering with a community health organization to develop SDOH strategies and implement effective interventions.
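A toy "what-if" comparison shows how such suggestions might be ranked. The sketch below assumes a hypothetical table of weekly hospitalization probabilities per risk profile and, as a strong simplification, treats each intervention as fully resolving one risk factor for every patient. Real prescriptive tools would model partial effects, costs, and uncertainty.

```python
# Hypothetical weekly hospitalization probability for each risk profile:
# (low_medication_adherence, unmet_sdoh_need).
RATES = {
    (False, False): 0.05,
    (True, False): 0.15,
    (False, True): 0.12,
    (True, True): 0.30,
}

# Each hypothetical intervention targets one position in the profile tuple.
INTERVENTIONS = {
    "medication adherence education": 0,
    "community SDOH partnership": 1,
}


def expected_admissions(patients):
    """Sum per-patient probabilities to get expected weekly admissions."""
    return sum(RATES[profile] for profile in patients)


def rank_interventions(patients):
    """Rank interventions by the expected admissions each would prevent.

    Assumes an intervention completely removes its targeted risk factor,
    which overstates what any real program would achieve.
    """
    baseline = expected_admissions(patients)
    results = []
    for name, factor_index in INTERVENTIONS.items():
        adjusted = [
            tuple(False if i == factor_index else flag for i, flag in enumerate(p))
            for p in patients
        ]
        prevented = baseline - expected_admissions(adjusted)
        results.append((name, round(prevented, 2)))
    return sorted(results, key=lambda result: -result[1])


# Hypothetical patient panel.
patients = (
    [(True, True)] * 10 + [(True, False)] * 20
    + [(False, True)] * 10 + [(False, False)] * 60
)

ranking = rank_interventions(patients)
for name, prevented in ranking:
    print(f"{name}: about {prevented} fewer admissions per week")
```

The ranking depends entirely on the assumed probabilities, so it should be read as a way of structuring the decision rather than as a recommendation in itself.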

Prescriptive analytics may also highlight weaknesses in a healthcare organization’s chronic disease management strategy and offer evidence-based solutions to fill these gaps, like increased use of digital therapeutics, the implementation of a remote patient monitoring (RPM) program, integration of continuous glucose monitors, and improved care coordination and value-based care approaches.

The potential for using the four types of big data analytics to transform healthcare is significant, but health systems must avoid potential pitfalls, such as a lack of diversity in big data research.

The World Health Organization (WHO) states that big data analytics may provide benefits like improved clinical decision-making, higher quality patient-centered care, enhanced disease threat monitoring, advanced fraud detection, and better waste reduction.

However, the organization also notes that “fragmented or incompatible data, concerns over data security, language barriers, lack of skills to collect and process data, ownership dilemmas, and expenses involved in data storage and transfer are just some of the potential challenges” associated with healthcare big data analytics.

To combat these challenges, the WHO recommends conducting more research to determine best practices, establish reporting and performance metrics, and outline how big data can be appropriately incorporated into daily medical practice.

