Working with disease data
Friends, I used to work on text data before.
Now I have a disease dataset with 18 features, 106 samples, and two classes.
79 samples are healthy specimens without the disease, and 25 are sick patients.
The remaining 2 samples have an unknown class.
Should I apply normalization and other pre-processing?
Should I use oversampling or undersampling?
Is this possible in RapidMiner?
Is there a typical process you would recommend for this kind of data?
As always, I'm grateful for your help.
Have a nice day, everyone.