This repository contains analyses and models for predicting heart disease, in particular heart failure. It includes clustering analyses, a classification tree, and logistic regression models, all applied to the dataset using SPSS.

Descriptive Statistics and Missing Value Analysis:
- Provides descriptive statistics such as mean, minimum, maximum, standard deviation, skewness, and kurtosis for key variables related to heart disease prediction.
- Analyzes missing values and identifies extreme values for each variable.
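
The statistics listed above can be illustrated outside SPSS with a minimal Python sketch. The cholesterol values are hypothetical, the skewness and kurtosis use simple moment formulas (SPSS applies small-sample adjustments, so its figures may differ slightly), and the 1.5 × IQR rule for extreme values is an assumption rather than the repository's documented criterion.

```python
import statistics

# Hypothetical cholesterol readings; None marks a missing value.
cholesterol = [233, 250, None, 204, 236, 564, 192, None, 294, 263]

observed = [x for x in cholesterol if x is not None]
n_missing = len(cholesterol) - len(observed)

n = len(observed)
mean = statistics.fmean(observed)
sd = statistics.stdev(observed)  # sample standard deviation (n - 1)
minimum, maximum = min(observed), max(observed)

# Moment-based skewness and excess kurtosis (unadjusted formulas).
m2 = sum((x - mean) ** 2 for x in observed) / n
m3 = sum((x - mean) ** 3 for x in observed) / n
m4 = sum((x - mean) ** 4 for x in observed) / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2 - 3

# Flag extreme values with the 1.5 * IQR rule (an assumed criterion).
q1, _, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
extremes = [x for x in observed if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(f"missing={n_missing} mean={mean:.2f} sd={sd:.2f} "
      f"skew={skewness:.2f} kurt={kurtosis:.2f} extremes={extremes}")
```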

T-Test for Equality of Means:
- Tests whether the means of two groups differ significantly for variables such as age and cholesterol level.
- Reports t-test statistics and p-values for each variable.
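
As a rough illustration of the test statistic, the sketch below computes a pooled-variance t statistic in Python for two hypothetical groups. The group names and ages are invented, and equal variances are assumed. The p-value would then be read from a t distribution with `df` degrees of freedom, which the standard library does not provide.

```python
import math
import statistics

# Hypothetical ages for patients with and without heart disease.
age_disease = [63, 67, 58, 60, 62, 57]
age_healthy = [41, 56, 44, 52, 48, 54]

n1, n2 = len(age_disease), len(age_healthy)
mean1, mean2 = statistics.fmean(age_disease), statistics.fmean(age_healthy)
var1, var2 = statistics.variance(age_disease), statistics.variance(age_healthy)

# Pooled-variance t statistic (equal variances assumed).
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(f"t = {t_stat:.3f}, df = {df}")
```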

Regression Analysis:
- Evaluates how well the regression model predicts age from several predictors.
- Reports R-squared, adjusted R-squared, standard error of the estimate, and Durbin-Watson statistic.

Pearson Correlation Matrix:
- Examines the correlation between various variables, such as age, resting blood pressure, cholesterol, and heart disease.
- Provides insights into the strength and direction of relationships between variables.

ANOVA Table:
- Analyzes the differences in means between groups for variables like age, resting blood pressure, cholesterol, and others.
- Reports F-statistics and p-values to determine the significance of group differences.
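
The F-statistic in a one-way ANOVA table is the ratio of between-group to within-group mean squares. A minimal Python sketch, assuming three hypothetical groups of resting blood pressure readings (group names and values are invented):

```python
import statistics

# Hypothetical resting blood pressure readings in three groups.
groups = {
    "group_a": [120, 130, 125, 128],
    "group_b": [135, 140, 138, 142],
    "group_c": [150, 145, 155, 160],
}
all_values = [x for g in groups.values() for x in g]
grand_mean = statistics.fmean(all_values)

# Sums of squares between and within groups.
ss_between = sum(
    len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups.values()
)
ss_within = sum(
    (x - statistics.fmean(g)) ** 2 for g in groups.values() for x in g
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```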

Chi-Square Test:
- Applies Fisher's exact test to determine significant associations between categorical variables.
- Reports the p-value to assess the significance of the association.
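
For a 2x2 table, the two-sided Fisher's exact p-value can be computed directly from the hypergeometric distribution. The sketch below is an illustration in Python; the function name and the cell counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x cases in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p for p in map(prob, range(lo, hi + 1)) if p <= p_observed * (1 + 1e-9)
    )

# Hypothetical counts: rows = two categories, columns = heart disease yes/no.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) suggests the two categorical variables are associated.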

Factor Analysis:
- Investigates the relationship between variables and identifies underlying factors.
- Reports component loadings for each variable in the dataset.

Clustering Analysis (BAVERAGE and QUICK):
- Uses two SPSS clustering methods to group similar cases based on heart disease prediction variables: BAVERAGE (hierarchical clustering with between-groups average linkage) and QUICK CLUSTER (k-means).
- Reports cluster centers, the number of cases in each cluster, and ANOVA results.
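
The k-means step behind cluster centers and cluster sizes can be sketched in Python as below. The cases, the choice of two clusters, and the fixed initial centers are all assumptions made for illustration. The hierarchical method is not shown, and real data would usually be standardized first.

```python
import math

# Hypothetical (age, resting blood pressure) cases.
cases = [(41, 120), (45, 125), (44, 118), (60, 150), (63, 160), (58, 145)]
centers = [cases[0], cases[3]]  # fixed initial centers keep the run deterministic

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

for _ in range(20):
    # Assignment step: attach every case to its nearest center.
    labels = [nearest(p, centers) for p in cases]
    # Update step: move each center to the mean of its members.
    new_centers = []
    for i in range(len(centers)):
        members = [p for p, lab in zip(cases, labels) if lab == i]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(centers[i])  # keep an empty cluster's center
    if new_centers == centers:
        break  # converged: centers no longer move
    centers = new_centers

cluster_sizes = [labels.count(i) for i in range(len(centers))]
print(centers, cluster_sizes)
```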

Classification Tree and Logistic Regression:
- Constructs a classification tree and a logistic regression model to predict heart disease.
- Reports model summaries, risk estimates, classification accuracy, and coefficients for predictor variables.
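
The sketch below shows in Python how logistic regression coefficients turn into predicted probabilities, a classification accuracy, and a risk estimate. The intercept, coefficients, predictor names, and cases are hypothetical and not taken from the repository's output. The risk estimate is taken here as the proportion of misclassified cases.

```python
import math

# Hypothetical fitted coefficients (not the repository's results).
intercept = -5.0
coefficients = {"age": 0.1, "exercise_angina": 1.5}

# Hypothetical cases: (age, exercise_angina, actual heart disease).
cases = [
    (40, 0, 0), (45, 1, 0), (55, 0, 1), (60, 1, 1),
    (48, 0, 0), (65, 0, 1), (52, 0, 1), (30, 1, 0),
]

def predict_probability(age, exercise_angina):
    """Logistic model: p = 1 / (1 + exp(-z)) with z the linear predictor."""
    z = (intercept
         + coefficients["age"] * age
         + coefficients["exercise_angina"] * exercise_angina)
    return 1 / (1 + math.exp(-z))

# Classify as diseased when the predicted probability reaches 0.5.
predicted = [int(predict_probability(a, e) >= 0.5) for a, e, _ in cases]
actual = [y for _, _, y in cases]

accuracy = sum(p == y for p, y in zip(predicted, actual)) / len(cases)
risk_estimate = 1 - accuracy  # proportion of cases misclassified

# Exp(B): multiplicative change in the odds per unit increase in a predictor.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

print(f"accuracy={accuracy:.3f} risk={risk_estimate:.3f} odds_ratios={odds_ratios}")
```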