Exploratory Data Analysis on Cardiovascular Health Dataset

Authors

  • Sayeda Mahak Musawwir
  • Kanishka Gupta
  • Sakshi Gangwar
  • Pratha Saxena
  • Megha Saxena

DOI:

https://doi.org/10.71143/mqff9k76

Abstract

Cardiovascular diseases encompass conditions that affect the heart and blood vessels, with symptoms such as fatigue, dizziness, chest pain, discomfort, palpitations, and edema. Three major life-threatening conditions, namely high blood pressure, high cholesterol, and diabetes, can lead to a diminished quality of life. The vast amount of healthcare data, or big data, contains valuable insights that can be extracted through Exploratory Data Analysis (EDA) to identify inaccuracies, locate pertinent data, verify assumptions, and assess the degree of association between exploratory factors. EDA is a crucial tool across industries for uncovering hidden patterns and forecasting future trends. This study examines a refined cardiovascular disease (CVD) dataset to identify clinical and demographic patterns linked to heart disease. The dataset includes 308,854 patients and 23 features, covering demographics (e.g., sex and age category), clinical variables (e.g., BMI, height, weight), health behaviors (e.g., smoking, exercise), and chronic conditions (e.g., diabetes, heart disease). Descriptive analysis showed that individuals with heart disease had a higher average BMI (29.6 vs. 28.5) and weight (86.9 kg vs. 83.3 kg) than those without. About 34% of patients were classified as obese (BMI > 30), indicating a significant at-risk group. Correlation analysis revealed a strong link between weight and BMI, with age showing a modest positive correlation with both BMI and weight. Boxplots indicated that patients with heart disease consistently had higher BMI and more extreme values, suggesting obesity as a major risk factor. K-means clustering identified three distinct subgroups, potentially representing different risk profiles based on age, weight, and BMI. These findings highlight key variables and transformations, such as obesity indicators, age-BMI interactions, and cluster memberships, for future predictive modeling of cardiovascular risk. Overall, this paper underscores the significance of data analysis in healthcare and its potential to transform the industry.
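
The analysis steps named in the abstract (group-wise averages, an obesity indicator at BMI > 30, Pearson correlation, and k-means with three clusters on age, weight, and BMI) can be illustrated with a minimal sketch. This is not the authors' code: the eight records below are invented for illustration and are not drawn from the study's dataset, the helper names (bmi, pearson, zscore, kmeans) are hypothetical, and standardizing the features before clustering is an assumption, since the abstract does not state how the features were scaled. Only the Python standard library is used; a real analysis would more likely rely on a data-frame and machine-learning library.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy records: (age, weight_kg, height_m, has_heart_disease).
# Invented for illustration only; NOT taken from the study's dataset.
records = [
    (34, 62.0, 1.70, False),
    (41, 70.5, 1.75, False),
    (29, 58.0, 1.62, False),
    (55, 95.0, 1.72, True),
    (63, 102.0, 1.78, True),
    (47, 88.0, 1.68, False),
    (70, 91.0, 1.65, True),
    (38, 76.0, 1.80, False),
]


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def zscore(values):
    """Standardize to zero mean and unit variance (an assumed preprocessing step)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if a cluster went empty.
        updated = [
            tuple(mean(dim) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids


ages = [r[0] for r in records]
weights = [r[1] for r in records]
bmis = [bmi(r[1], r[2]) for r in records]

# Descriptive comparison: mean BMI with vs. without heart disease.
mean_bmi_hd = mean(b for b, r in zip(bmis, records) if r[3])
mean_bmi_no_hd = mean(b for b, r in zip(bmis, records) if not r[3])

# Obesity indicator, using the BMI > 30 threshold from the abstract.
obese_share = sum(b > 30 for b in bmis) / len(bmis)

# Pairwise correlations among the three numeric variables.
corr_weight_bmi = pearson(weights, bmis)
corr_age_bmi = pearson(ages, bmis)

# K-means with k = 3 on standardized (age, weight, BMI).
points = list(zip(zscore(ages), zscore(weights), zscore(bmis)))
labels, centroids = kmeans(points, k=3)

print(f"mean BMI (heart disease / none): {mean_bmi_hd:.1f} / {mean_bmi_no_hd:.1f}")
print(f"obese share: {obese_share:.0%}")
print(f"corr(weight, BMI) = {corr_weight_bmi:.2f}, corr(age, BMI) = {corr_age_bmi:.2f}")
print("cluster labels:", labels)
```

The resulting labels could be carried forward as a cluster-membership feature alongside the obesity indicator, in the spirit of the transformations the abstract proposes for later predictive modeling. On toy data of this size the clusters carry no clinical meaning.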

Published

29-10-2025

Section

Articles

How to Cite

Sayeda Mahak Musawwir, Kanishka Gupta, Sakshi Gangwar, Pratha Saxena, & Megha Saxena. (2025). Exploratory Data Analysis on Cardiovascular Health Dataset. International Journal of Research and Review in Applied Science, Humanities, and Technology. https://doi.org/10.71143/mqff9k76