Remove binary predictors that have <50 entries in one category
Check Data set
summary(d2)
 SwollenLymphNodes ChestCongestion ChillsSweats NasalCongestion Sneeze
 No :418           No :323         No :130      No :167         No :339
 Yes:312           Yes:407         Yes:600      Yes:563         Yes:391

 Fatigue   SubjectiveFever Headache  Weakness      CoughIntensity
 No : 64   No :230         No :115   None    : 49  None    : 47
 Yes:666   Yes:500         Yes:615   Mild    :223  Mild    :154
                                     Moderate:338  Moderate:357
                                     Severe  :120  Severe  :172

 Myalgia       RunnyNose  AbPain    ChestPain  Diarrhea  EyePn     Insomnia
 None    : 79  No :211    No :639   No :497    No :631   No :617   No :315
 Mild    :213  Yes:519    Yes: 91   Yes:233    Yes: 99   Yes:113   Yes:415
 Moderate:325
 Severe  :113

 ItchyEye  Nausea    EarPn     Hearing   Pharyngitis  Breathless  ToothPn
 No :551   No :475   No :568   No :700   No :119      No :436     No :565
 Yes:179   Yes:255   Yes:162   Yes: 30   Yes:611      Yes:294     Yes:165

 Vision    Vomit     Wheeze    BodyTemp
 No :711   No :652   No :510   Min.   : 97.20
 Yes: 19   Yes: 78   Yes:220   1st Qu.: 98.20
                               Median : 98.50
                               Mean   : 98.94
                               3rd Qu.: 99.30
                               Max.   :103.10
We can see that Vision (19 Yes) and Hearing (30 Yes) are the only binary predictors with <50 entries in one category. Let’s remove them.
Remove Vision and Hearing
d3 <- d2 %>% select(!c(Vision, Hearing))
Now we have a dataframe with 730 observations and 26 variables.
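The same check can also be done programmatically instead of reading the summary by eye. The following is a minimal base R sketch, not the approach used above: `toy`, `min_count`, and `is_sparse_binary()` are hypothetical names, and the toy data only stands in for d2 (with a threshold of 10 rather than 50 to suit its 100 rows).

```r
# Toy data standing in for d2 (made-up values, for illustration only)
toy <- data.frame(
  Vision   = factor(rep(c("No", "Yes"), times = c(95, 5))),
  Nausea   = factor(rep(c("No", "Yes"), times = c(60, 40))),
  Weakness = factor(rep(c("None", "Mild", "Moderate", "Severe"),
                        times = c(4, 32, 32, 32))),
  BodyTemp = seq(97, 103, length.out = 100)
)

# Hypothetical threshold; the text above uses 50 on the full data set
min_count <- 10

# TRUE only for a two-level factor whose rarer level has fewer than min_n rows
is_sparse_binary <- function(x, min_n) {
  is.factor(x) && nlevels(x) == 2 && min(table(x)) < min_n
}

# Names of the binary predictors that fail the threshold
sparse <- names(toy)[vapply(toy, is_sparse_binary, logical(1), min_n = min_count)]

# Drop those columns and keep everything else
toy_reduced <- toy[, !(names(toy) %in% sparse), drop = FALSE]

print(sparse)
print(dim(toy_reduced))
```

Restricting the check to two-level factors matters here: a multi-level predictor can also have a sparse level (Weakness in the toy data), but the rule in this section applies only to binary predictors, so such columns are kept.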