Is it possible to get 100% for split validation accuracy?
Joannach0ng
Member Posts: 7 Learner I
Is it possible to get 100% for split validation accuracy, and what are the pros of getting 100% accuracy? Thank you
Answers
In my opinion, most of the time this would be alarming. For some problems it may be possible, but for most real business problems it is not. A helpful point of reference is to ask, 'If a team of experts were to look closely at the data, how good would they be at making their predictions?' That can sometimes give you an idea of what a good accuracy might be. For some simple problems it may be near or at 100%; for many problems in business it won't be anywhere close.
If you have 100% accuracy, I would check for attributes that are too closely correlated with the outcome; they may contain information that wouldn't be available until after the outcome is observed. There's some more information about correct validation in this course: https://academy.rapidminer.com/learn/course/applications-use-cases-professional/
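One rough way to run that check outside RapidMiner is to correlate every attribute with the label and flag anything close to a perfect match. The sketch below is only an illustration: the attribute names, the toy data, and the 0.95 threshold are all made up for the example, and plain Pearson correlation only catches linear relationships with numeric or 0/1 attributes.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot be correlated with anything
    return cov / math.sqrt(var_x * var_y)

def suspicious_attributes(columns, label, threshold=0.95):
    """Names of attributes whose absolute correlation with the label meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, label)) >= threshold]

# Hypothetical toy data: 'refund_issued' would only be known after a customer has churned.
columns = {
    "age": [25, 40, 52, 31, 46, 29],
    "refund_issued": [1, 0, 1, 0, 0, 1],
}
churned = [1, 0, 1, 0, 0, 1]

print(suspicious_attributes(columns, churned))
```

Any attribute this flags is worth a second look: ask whether its value would really be known at the moment the prediction has to be made.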
I'd recommend taking a little time to go through the course. Also, if you have come up with 100% accuracy, are you able to share more about the use-case and data, or the process you are using? We might be able to provide better help.
my 2 cents.
I am taking the risk of being accused of teaching you bad things, but technically you can achieve it by training and testing the model on exactly the same dataset.
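To see why testing on the training data is misleading, here is a minimal Python sketch (not a RapidMiner process). It assumes a hand-written 1-nearest-neighbour model and randomly generated data whose labels are pure noise; all names in it are invented for the illustration.

```python
import random

def predict_1nn(train, point):
    """Return the label of the training example closest to `point`."""
    nearest = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)))
    return nearest[1]

def accuracy(train, test):
    """Fraction of `test` examples whose label the 1-NN model predicts correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)

rng = random.Random(0)
# Labels are coin flips, so there is nothing real to learn from the attributes.
data = [((rng.random(), rng.random()), rng.choice([0, 1])) for _ in range(200)]
train, holdout = data[:100], data[100:]

train_acc = accuracy(train, train)      # model is tested on the data it memorised
holdout_acc = accuracy(train, holdout)  # model is tested on unseen data
print(train_acc, holdout_acc)
```

Every training point is its own nearest neighbour, so the first number is a perfect score even though the labels are random; the second number is the honest estimate.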
But still, take the other commenters' concerns into account, because this:
- Makes no sense for any real-life machine learning problem.
- Is a serious mistake from a data science point of view.
Are you sure this is exactly what your tutor asked you to do? If so, I suggest studying the problem in question and convincing your tutor that this is the wrong approach.
Vladimir
http://whatthefraud.wtf
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
One possible exception might be if you have a small number of examples in the test dataset but a large number of attributes in the model, in which case your model can be "over-specified" (basically too many attributes will lead to some unique combination serving as a kind of id to make the predictions). Or if you just have too few examples in the test set altogether (e.g., imagine the reductio of 1 test case, which would then either be 100% accurate or 0%!) this can also happen by random chance.
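The random-chance part can be put into numbers with a short Python sketch. It assumes each test example is predicted correctly independently with the same probability, which is a simplification; the function name and the 0.8 figure are only illustrative.

```python
def chance_of_perfect_score(n_test, per_example_accuracy):
    """Probability that all n_test independent predictions are correct."""
    return per_example_accuracy ** n_test

# A model that is right 80% of the time, scored on test sets of different sizes.
for n in (1, 5, 20, 100):
    print(n, chance_of_perfect_score(n, 0.8))
```

With a single test case the 80% model scores 100% four times out of five, while with 100 test cases a perfect score by luck is practically impossible.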
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
The thing is that I had a dataset with some 12 attributes working like this (for the sake of reducing complexity, I'm going to explain with an OR logic gate):
Not the most elegant solution, but a hell of a win for data science.
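The original illustration is not reproduced here, so the following Python sketch is only an assumption about what the OR-gate case looks like: a label that is a fixed logical function of the attributes. With data like that, a model that sees every input combination during training can legitimately reach 100% on the held-out part of a split.

```python
from itertools import product

def or_gate(a, b):
    """Label rule for the toy data: 1 if either input is 1."""
    return int(a or b)

# All four input combinations of a two-input OR gate, repeated to form a toy dataset.
combos = list(product([0, 1], repeat=2))
data = [(combos[i % 4], or_gate(*combos[i % 4])) for i in range(40)]
train, test = data[:28], data[28:]  # 70/30 split in original order, no shuffling

# "Learn" by memorising the label seen for each attribute combination.
model = {}
for inputs, label in train:
    model[inputs] = label

accuracy = sum(model[inputs] == label for inputs, label in test) / len(test)
print(accuracy)  # 1.0
```

Here the perfect score is not a validation mistake, because the label contains no noise and the test set holds no combination the model has not already seen.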