Due to recent updates, all users are required to create an Altair One account to login to the RapidMiner community. Click the Register button to create your account using the same email that you have previously used to login to the RapidMiner community. This will ensure that any previously created content will be synced to your Altair One account. Once you login, you will be asked to provide a username that identifies you to other Community users. Email us at Community with questions.
Predict missing values
hatsjikidee
Member Posts: 3 Learner I
Hello all,
I have a dataset with about 3000 records of rated songs. About half are rated, the other half is not. I'm trying to build a model that predicts the empty ratings based on what users rated. I have done the following:
My question is, is this correct? Do I need to make adjustments to make it more correct? Because when I for example already change the k I get different values. And another question: how do I show only the values that have been predicted instead of a full overview, including the already filled in values.
Thanks in advance!
I have a dataset with about 3000 records of rated songs. About half are rated, the other half is not. I'm trying to build a model that predicts the empty ratings based on what users rated. I have done the following:
My question is, is this correct? Do I need to make adjustments to make it more correct? Because when I for example already change the k I get different values. And another question: how do I show only the values that have been predicted instead of a full overview, including the already filled in values.
Thanks in advance!
Tagged:
0
Best Answer
-
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn@hatsjikidee,
OK, I understand. In theory, your method is the good one....but as you mentionned for each k value, you have different results, but you can not evaluate the "performance" of each prediction.
From my point of view, to create a real recommender model, you need descriptive features of your song(s). For example
you need an associated dataset with for each song, its style (pop, rock etc.), its lenght, its author etc.
Hope this helps,
Regards,
Lionel
PS : There is a useful ressource (a book) for you :
- "RapidMiner, Data mining use cases and business analytics applications", (Chapter 9 : Constructing Recommender Systems in RapidMiner) , from Markus Hofmann and Ralf Klinkenberg.
- the associated extension "Recommenders" (to install from the MarketPlace).8
Answers
If you have some descriptive features of your songs, you can build a model based on your labeled data (your rated songs) and then apply this model to the unlabelled data (the unrated songs).
To help you further can you share your data ?
Hope this helps,
Regards,
Lionel
The dataset has 3 attributes:
Song name - Rating - Name (of the rater)
Every user has about 40 songs, of which 20 rated and 20 not. So the goal is to predict the missing ones based on what the user rated on the ones he did rate. Hope this gives more clarification.
I don't know if you are doing this as part of a class or just for the fun of doing it but in real life part of being a Data Scientist is analysing the problem, identify the data that may or may not predict an outcome and then extract it from it source. Sometimes the Example Set includes all the attributes you may need and sometimes you need to go out to the internet and find it to enhance your analysis.
Hope this helps and if you need help texts us and we'll be glad to guide you on the process.
Best Regards.