
New RapidMiner user seeking advice

KMC_PhD Member Posts: 5 Contributor II
edited November 2018 in Help
Hello

I hope this isn't a silly question; I am just getting started with RapidMiner.

I have a spreadsheet which contains 24 rows of information on student engagement with a virtual learning environment. I want to be able to classify students into certain ability groups based on the data in the spreadsheet, which is complete for each user.

My question is: can I explicitly state rules, e.g. if the student spends more than a certain amount of time on the virtual learning environment, accesses a forum, and gets more than 60% in a quiz, then classify the student as learner type A? If so, what is the best way of going about this, e.g. decision trees or association rules?

Any help is appreciated
 

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    first of all, 24 rows, i.e. 24 examples, is a very small learning base for any kind of machine learning algorithm, especially if you have a lot of columns, i.e. attributes. You probably won't be able to automatically train any (good) decision tree based on a data set that small.

    To create a new attribute called Type which is set to certain values based on manual rules, you should have a look at the Generate Attributes operator. There you are able to specify e.g. rules with the syntax
    if(quiz_value > 60 && time_spent > 10, "A", "B")
    which will create a new attribute and set it to A if the condition matches and to B otherwise.

    For an introduction to the general concepts of RapidMiner I'd like to direct you to the video tutorials on our website.

    Best regards,
    Marius
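
For readers who want to check the rule logic outside RapidMiner, the expression above can be mirrored in a few lines of Python. This is only a sketch: the function name `learner_type` and the sample rows are hypothetical, while the attribute names and thresholds (`quiz_value > 60`, `time_spent > 10`) are taken from the expression in the reply.

```python
def learner_type(quiz_value: float, time_spent: float) -> str:
    """Return "A" when both thresholds are exceeded, otherwise "B"."""
    # Same condition as if(quiz_value > 60 && time_spent > 10, "A", "B")
    return "A" if quiz_value > 60 and time_spent > 10 else "B"


# Hypothetical sample rows, invented for illustration only
students = [
    {"student": "s1", "quiz_value": 72, "time_spent": 14},
    {"student": "s2", "quiz_value": 72, "time_spent": 6},
    {"student": "s3", "quiz_value": 55, "time_spent": 20},
]

# Add the new "Type" attribute to every row
for row in students:
    row["Type"] = learner_type(row["quiz_value"], row["time_spent"])

types = [row["Type"] for row in students]
print(types)
```

Both comparisons are strict, so a student with exactly 60 in the quiz is assigned to B.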
  • KMC_PhD Member Posts: 5 Contributor II
    Thank you very much for the reply

    As a correction: I am using 24 attributes, currently testing with approx. 200 rows of data, which will increase to approx. 1000 rows.

    Would your solution still be advisable with the increase in data? 
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    If you want to stick with manual rules, yes. However, 1000 users should also be a good basis for automated approaches: you could try a clustering algorithm and then describe the resulting clusters, to see which rules an algorithm would use to group your users.

    Best regards,
    Marius
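
The "cluster, then describe the clusters" idea can be sketched in plain Python as below. This is an illustration under stated assumptions, not RapidMiner's clustering operator: it uses a minimal k-means with a deterministic start, the function name `kmeans` is hypothetical, and the six sample rows of (time_spent, quiz_value) are invented. With real data the attributes would probably need scaling first, since k-means is sensitive to differing value ranges.

```python
from statistics import mean


def kmeans(points, k, iterations=20):
    """Minimal k-means: returns (labels, centroids) for a list of tuples."""
    # Deterministic start (an assumption of this sketch): take k points
    # spread evenly through the list instead of random ones.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance)
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


# Hypothetical rows: (time_spent, quiz_value)
points = [(2, 35), (3, 40), (4, 38), (12, 75), (14, 80), (15, 78)]
labels, centroids = kmeans(points, k=2)

# "Describe" each cluster by its average attribute values
for c, (avg_time, avg_quiz) in enumerate(centroids):
    size = labels.count(c)
    print(f"cluster {c}: {size} users, "
          f"mean time_spent={avg_time:.1f}, mean quiz_value={avg_quiz:.1f}")
```

Reading the per-cluster averages is one simple way to turn clusters into candidate rules, for example noticing that one group combines long usage times with high quiz scores.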
  • KMC_PhD Member Posts: 5 Contributor II
    Thanks for the further info. I think I will stick to the manual rules for now and see how I get on.