Getting started with predictions of probabilities
I'd like to learn more about data mining and predictions with RM. I created a small scenario to learn with, and I need some help to get started and to understand working with RM a little better:
I want to predict the outcome of a card game that is always played by 3 players.
Each player has certain attributes which should indicate which player has the best chance to win.
My goal is to predict each player's probability of winning, for example:
Player 1: 10%
Player 2: 50%
Player 3: 40%
As a data basis I have a spreadsheet with training and testing data. Each row holds one game played, with the attributes of all three players and the result:
P1_Name, P1_Att1, P1_Att2, ..., P2_Name, P2_Att1, P2_Att2, ..., P3_Name, P3_Att1, P3_Att2, ..., OUTCOME of the game (1, 2 or 3 wins)
Question 1: How do I declare the attributes correctly?
So far, my understanding is the following. Spreadsheets usually have this structure (attributes/label):
att1, att2, att3.., outcome (label)
The attribute names let the machine tell the attributes apart, in the training data as well as in the testing data.
Further, all the attributes combined have an impact on the outcome. The size of each attribute's impact can differ, which is expressed by setting or calculating a weight.
This observation brings me to the following difficulties/questions in my example:
P1_Att1, P2_Att1 and P3_Att1 are the same type of attribute but are treated as different attributes because of their names. RM will therefore interpret them differently, which can lead to a different prediction if you swap Player 1 and Player 2 in a game. Each player's Att1 should be interpreted in the same way regardless of the player's position. Is it possible to declare that in RM?
All the attributes of one player should be analyzed together for that player, because my hypothesis is that you can only make a good estimate of the outcome if you consider all the data of one player together. Is it also possible to declare which attributes belong to which player?
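To make the idea concrete, here is a minimal sketch in plain Python (not RM) of what I mean by treating every player's attributes the same way: one wide game row is split into one record per player, so Att1 has the same name for every player. The column names follow the P<n>_<attribute> pattern from my spreadsheet; the helper name `split_game_row`, the `won` column and the sample values are made up for illustration.

```python
def split_game_row(row, n_players=3):
    """Turn one wide game row into one record per player with shared attribute names."""
    winner = int(row["OUTCOME"])
    records = []
    for p in range(1, n_players + 1):
        prefix = f"P{p}_"
        # Strip the position prefix so Att1 means the same thing for every player.
        record = {
            key[len(prefix):]: value
            for key, value in row.items()
            if key.startswith(prefix)
        }
        # Hypothetical per-player label: 1 if this player won the game, else 0.
        record["won"] = 1 if winner == p else 0
        records.append(record)
    return records


# Made-up example game; the values carry no meaning.
game = {
    "P1_Name": "Ann", "P1_Att1": 3, "P1_Att2": 7,
    "P2_Name": "Bob", "P2_Att1": 5, "P2_Att2": 2,
    "P3_Name": "Cid", "P3_Att1": 4, "P3_Att2": 4,
    "OUTCOME": 2,
}
player_rows = split_game_row(game)
```

With this layout each player is one example with the same attribute columns, and the position in the game no longer changes how an attribute is named. Whether this is the right way to set it up in RM is exactly what I am unsure about.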
Question 2: How can I generate the 3 probabilities for my testing data?
So far, I only get a confidence for the predicted winner. Is it possible to get the win probability for the second player as well? (The third can be calculated from the first and second.)
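As a sketch of the result I am after, again in plain Python and not RM: if I had one non-negative score per player for a game, I could rescale the scores so they sum to 1 and read them as win probabilities. The function name `normalize_scores` and the raw scores are assumptions for illustration; they do not come from any RM output.

```python
def normalize_scores(scores):
    """Rescale non-negative per-player scores so they sum to 1 within one game."""
    if any(score < 0 for score in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {player: score / total for player, score in scores.items()}


# Made-up raw scores for one game.
raw = {"Player 1": 0.2, "Player 2": 1.0, "Player 3": 0.8}
probs = normalize_scores(raw)

# The third probability also follows from the other two.
third = 1.0 - probs["Player 1"] - probs["Player 2"]
```

The three values add up to 1, so knowing two of them is enough to get the third.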
I really appreciate any help that will get me started.