[SOLVED] creating very many models using a single data set?

np1234 Member Posts: 2 Contributor I

I'm a new user of RapidMiner and was wondering if someone in the community knows a good way to do what I am trying to do.

I am working with a very large dataset (millions of examples) that has 1 id attribute, 1 text attribute, 52 numerical attributes and 1 label attribute per example (row).  The text attribute takes about 500 unique values across the whole data set.  What I would like to do is create a decision tree model (and store it) for the data corresponding to each unique value of the text attribute.  That is, for each unique value, I want to select all the examples with that value and then train a decision tree model on the 52 numerical attributes and the label attribute.  I could do it manually with the Filter Examples, Decision Tree, and Store operators for each unique value, but I would have to do this about 500 times.  Is there an efficient way to implement this?  I could try to do this using scripting, but I was wondering if I could use the built-in operators.  Is the Loop operator the answer?

Thanks in advance.
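
For the scripting route mentioned above, the task comes down to grouping the examples by the text attribute and training one model per group. Below is a minimal Python sketch of that pattern, not a RapidMiner process. The attribute names (`text`, `x1`, `label`), the toy rows, and the `train_majority` learner are all assumptions made for illustration; `train_majority` is only a stand-in for a real decision tree learner, which would be passed in through the `train` argument.

```python
from collections import Counter, defaultdict


def group_by_value(rows, key):
    """Split rows into one list per unique value of `key` in a single pass."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def train_majority(rows, label_key="label"):
    """Stand-in learner (assumption): predicts the most common label.

    A real decision tree learner would go here instead.
    """
    return Counter(row[label_key] for row in rows).most_common(1)[0][0]


def train_per_value(rows, key, train=train_majority):
    """Train one model per unique value of `key`; return them keyed by value."""
    return {value: train(subset)
            for value, subset in group_by_value(rows, key).items()}


# Toy data (hypothetical): one numerical attribute instead of 52.
rows = [
    {"id": 1, "text": "A", "x1": 0.2, "label": "yes"},
    {"id": 2, "text": "A", "x1": 0.4, "label": "yes"},
    {"id": 3, "text": "A", "x1": 0.9, "label": "no"},
    {"id": 4, "text": "B", "x1": 0.1, "label": "no"},
]

models = train_per_value(rows, "text")
for value, model in models.items():
    print(value, model)
```

Grouping in one pass avoids scanning the full data set once per unique value, which matters with millions of examples and about 500 values. Storing each model (for example, one file per value) would replace the `print` call.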


  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research

    The Loop Values operator is the one you are looking for.
    Its tutorial process shows an example very similar to your problem, using the value of the loop_value macro to filter the examples.
    I hope this answers your question; if not, don't hesitate to ask.

    Best regards,
  • np1234 Member Posts: 2 Contributor I
    Thanks David, I did figure it out a couple of weeks ago using, just as you said, the Loop Values operator.  It took me a little while to work out how to access the macro, but I got it finally.  The curly brackets threw me off.  I thought they were just notation to mark the macro name, but they actually have to be typed in.
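
On the curly brackets: in RapidMiner a macro is referenced as `%{macro_name}`, so the loop macro from this thread is written `%{loop_value}`, braces included. The Python sketch below only imitates that substitution to show why the braces matter. It is not RapidMiner's implementation, and the filter string `text = %{loop_value}` and the attribute name `text` are hypothetical.

```python
import re

# Matches %{name}; the braces are part of the syntax, not decoration.
MACRO = re.compile(r"%\{(\w+)\}")


def expand_macros(text, macros):
    """Replace each %{name} with its value; leave unknown macros untouched."""
    def lookup(match):
        name = match.group(1)
        return str(macros.get(name, match.group(0)))
    return MACRO.sub(lookup, text)


macros = {"loop_value": "A"}

with_braces = expand_macros("text = %{loop_value}", macros)
without_braces = expand_macros("text = %loop_value", macros)

print(with_braces)     # macro is expanded
print(without_braces)  # no braces, so nothing is substituted
```

Inside Loop Values, the macro takes a different value of the looped attribute on each iteration, so a filter written this way selects a different subset each time.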