Newbie question

myintuition Member Posts: 3 Contributor I
edited November 2018 in Help

I am sure this is simple but I am stuck.

Hoping someone can point me to a tutorial or even tell me the correct terminology.


I am trying to use RapidMiner to analyze teachers' teaching methods over time.

E.g.: it is an elementary school with many teachers and many grades.

I want to group the students by teacher and teaching method and then compare the groups over time.


So say 30 students in grade 1 were in Mr A's class and learned teaching method A.

Then in grade 2 some were moved to Mr B's class and learned teaching method B.

Some were moved to Mrs C's class and learned teaching method C.

All the way up to grade 6.

Then of course new students are coming to the school all the time.


My label would be the student's grade per subject.

Looking for a pattern of learning.


How do I group and regroup this so I can compare results over time?

Then of course I want to look at certain students, such as a student with ADD, and see which path was most successful over time.


Thanks in advance.

Any help or hints would be appreciated.


  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Hi there myintuition and welcome to the RapidMiner community!


    Based on your brief description, this could be a somewhat complicated analysis depending on exactly what you are trying to find out.  But the first step is really to define your outcome variable, which in RapidMiner is called the "label".  So you need to have a set of students whose performance has already been evaluated in some way (e.g., perhaps it is their end-of-year academic performance).  Typically you would then decide whether you are trying to predict a continuous outcome (e.g., their final grade on a numerical scale from 0-100) or a categorical outcome (e.g., a binary outcome such as whether they passed and will be promoted to the next grade).


    Once you have defined your outcome, you can consider which types of modeling algorithms (various operators available in RapidMiner) are best suited to that type of problem. You can then also construct a suitable series of predictor variables (called "attributes" in RapidMiner) for each student (each row is called an "example" in RapidMiner). For instance, you might have a set of binary variables for whether each student had a particular teacher or a particular teaching method. The modeling operators will then use the attributes you define to determine what kinds of relationships are present in the data.
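To make the idea of binary attributes concrete, here is a minimal sketch in plain Python (not RapidMiner). The record layout, the student ids, and the `had_teacher_...` / `had_method_...` attribute names are assumptions made up for illustration; inside RapidMiner you would build the same table with its own operators.

```python
# Hypothetical long-format records: (student, school_grade, teacher, method).
records = [
    ("s1", 1, "Mr A", "A"),
    ("s1", 2, "Mr B", "B"),
    ("s2", 1, "Mr A", "A"),
    ("s2", 2, "Mrs C", "C"),
]

def indicator_row(student, records):
    """Return one dict of 0/1 attributes describing a single student."""
    teachers = sorted({r[2] for r in records})
    methods = sorted({r[3] for r in records})
    mine = [r for r in records if r[0] == student]
    row = {"student": student}
    for t in teachers:
        # 1 if the student was ever taught by this teacher, else 0.
        row[f"had_teacher_{t}"] = int(any(r[2] == t for r in mine))
    for m in methods:
        # 1 if the student was ever exposed to this method, else 0.
        row[f"had_method_{m}"] = int(any(r[3] == m for r in mine))
    return row

students = sorted({r[0] for r in records})
rows = [indicator_row(s, records) for s in students]
for row in rows:
    print(row)
```

Every student ends up with the same set of columns, which is what a learner needs; the label (for example, pass or fail) would be added as one more column.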


    For your example, I would recommend starting with a binary outcome and then looking at a decision tree, which will help you understand some of the relationships that might exist between teaching method and successful outcomes and will also handle categorical and non-linear relationships well. You can always move on to more advanced techniques later, but it should be a good starting point.
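As a rough illustration of what a decision tree does at its first split, the sketch below picks the binary attribute that best separates a pass/fail label. It is plain Python, not RapidMiner's Decision Tree operator; Gini impurity is just one common split criterion and is an assumption here, and the four rows and the `passed` label are invented toy data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, attributes, label="passed"):
    """Return the binary attribute with the lowest weighted Gini impurity."""
    best_attr, best_score = None, float("inf")
    for attr in attributes:
        left = [r[label] for r in rows if r[attr] == 0]
        right = [r[label] for r in rows if r[attr] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_attr, best_score = attr, score
    return best_attr, best_score

# Hypothetical one-row-per-student table with a binary outcome.
rows = [
    {"had_method_A": 1, "had_method_B": 1, "passed": 1},
    {"had_method_A": 1, "had_method_B": 0, "passed": 0},
    {"had_method_A": 0, "had_method_B": 1, "passed": 1},
    {"had_method_A": 0, "had_method_B": 0, "passed": 0},
]
attr, score = best_split(rows, ["had_method_A", "had_method_B"])
print(attr, score)
```

A full tree repeats this choice inside each resulting subset, which is how it can surface combinations of teachers and methods rather than single attributes.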



    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • myintuition Member Posts: 3 Contributor I

    I have already done a decision tree and my label is the grade.

    But as far as I can tell, it looks at each record individually.

    So each student should have about 60 records (10 classes per year, over 6 years).

    A decision tree tells me what is important, but I need to find a pattern.


    I am new to data mining, but I am a computer programmer, so complexity does not scare me.

    I just need to know how to start.



  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn


    If you are using data mining techniques, you will ALWAYS need to bring your data into a form in which each unit you want to make a decision for is represented by exactly one Example (what rows are called in RapidMiner). If you have many grades, you will need to arrange them in one row. Maybe the simplest solution would be to compute the average grade per student. More complex solutions are possible, BUT: usually all students need to have the same information in one row. Missing values are not permitted by most learning algorithms.
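The "one row per student" idea can be sketched in plain Python as follows. This is only an illustration under assumptions: the record layout, the marks, and the column names such as `g1_math` are hypothetical, and in RapidMiner the same reshaping would be done with its aggregation and pivoting operators rather than code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (student, school_grade, subject, mark).
records = [
    ("s1", 1, "math", 80), ("s1", 1, "reading", 70),
    ("s1", 2, "math", 90),
    ("s2", 1, "math", 60), ("s2", 1, "reading", 65),
]

def one_row_per_student(records):
    """Collapse many records per student into a single row with fixed columns."""
    columns = sorted({(g, subj) for _, g, subj, _ in records})
    marks = defaultdict(dict)
    for student, g, subj, mark in records:
        marks[student][(g, subj)] = mark
    rows = []
    for student in sorted(marks):
        # Simplest summary: the average mark over all of the student's records.
        row = {"student": student, "avg_mark": mean(marks[student].values())}
        for g, subj in columns:
            # None marks a missing value that still has to be handled
            # (imputed or filtered) before most learners will accept the row.
            row[f"g{g}_{subj}"] = marks[student].get((g, subj))
        rows.append(row)
    return rows

rows = one_row_per_student(records)
for row in rows:
    print(row)
```

Student `s2` has no grade 2 math record, so that column comes out as `None`, which shows why newly arrived students need special treatment before modeling.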


    There's a good comprehensive book out there called 

    Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner

    by Vijay Kotu and Bala Deshpande

    Perhaps worth a look if you are looking for some guidance on how to approach it.




  • myintuition Member Posts: 3 Contributor I

    I guess what I am looking for is the path that led to the good grades.

    What types of teaching methods were the students subjected to that led to good grades?

    And when was the path the student took not beneficial?


    The label will always be the grade.


    So I guess my question is how do I compare paths over time (60 records for each student)?

    Or I could reduce it per subject so then there would be 6 records per student.
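One possible way to compare paths, sketched in plain Python under stated assumptions: reduce each student's records for one subject to a single path string of teaching methods ordered by school grade, then average an outcome per path. The records, the `A>B` path notation, and the choice of the most recent mark as the outcome are all hypothetical, not something prescribed in this thread.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for one subject: (student, school_grade, method, mark).
records = [
    ("s1", 1, "A", 70), ("s1", 2, "B", 85),
    ("s2", 1, "A", 72), ("s2", 2, "C", 60),
    ("s3", 1, "A", 68), ("s3", 2, "B", 81),
]

def paths_and_outcomes(records):
    """Map each student to (method path, mark in their latest school grade)."""
    by_student = defaultdict(list)
    for student, g, method, mark in records:
        by_student[student].append((g, method, mark))
    result = {}
    for student, steps in by_student.items():
        steps.sort()  # order the steps by school grade
        path = ">".join(method for _, method, _ in steps)
        result[student] = (path, steps[-1][2])
    return result

def mean_outcome_by_path(records):
    """Average the outcome over all students who followed the same path."""
    groups = defaultdict(list)
    for path, final_mark in paths_and_outcomes(records).values():
        groups[path].append(final_mark)
    return {path: mean(marks) for path, marks in groups.items()}

summary = mean_outcome_by_path(records)
print(summary)
```

The path string is one row-level attribute per student, so it fits the one-example-per-student layout; with six school grades and several methods many paths will have only a handful of students, so the averages should be read with care.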


