AdaBoost individual model performance

Thiru Member Posts: 100 Guru
edited April 2020 in Help
I'm using AdaBoost + KNN on my data, which gives an accuracy of 77.24, along with precision and recall.
AdaBoost is configured with 10 iterations.
Is there any way in RapidMiner to view the performance of the model in each iteration and the weights assigned in successive iterations?
Please let me know. Thanks.

Regards,
Thiru

Best Answer

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited April 2020 Solution Accepted
    Hello @Thiru

    1. AdaBoost tries to improve an algorithm by giving more weight, in each iteration, to the samples that were misclassified earlier and building the next classifier with those weights. This all happens on the training side. The outcome of this training is an ensemble of models (decision trees in my example process), which is then applied to the testing data to check how well the trained algorithm performs. So Adaboost_1 to Adaboost_10 are training performances, and you can see from them that the trained model is improving. The testing performance, however, is only 67.74%, which means you still need to tweak parameters or the model is overfitting.

    2. Yes, you will have 20 if the "mod" port of the validation operator is connected. The reason is that the split validation operator runs the training side twice when its "mod" port is connected to any other operator or result: once on the training partition (70% of the data in the case of a 70:30 split), and once more on the whole data after validation is complete. To avoid this, just remove the connection from the "mod" port of the validation operator. If you want to keep it, the results are simple to distinguish: the first 10 performances relate to the 70% training data, and performances 11 to 20 relate to the whole data.
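
    The counting described in point 2 can be sketched in a few lines. This is plain Python, not RapidMiner code: `stored_names` and its parameters are hypothetical names chosen for illustration, and the sketch assumes only what is stated above, namely that the training side runs once on the training partition and, if the "mod" port is connected, once more on the whole data, with the execution counter continuing across both runs.

```python
def stored_names(iterations, mod_port_connected):
    """Return (stored name, data used) pairs, one per execution of Store."""
    # The training side always runs once on the training partition.
    training_runs = ["training partition"]
    # Assumption taken from the answer above: a connected "mod" port
    # triggers a second training run on the whole data.
    if mod_port_connected:
        training_runs.append("whole data")

    names = []
    execution_count = 0  # keeps counting across both training runs
    for data_used in training_runs:
        for _ in range(iterations):
            execution_count += 1
            names.append((f"Adaboost_{execution_count}", data_used))
    return names


connected = stored_names(iterations=10, mod_port_connected=True)
disconnected = stored_names(iterations=10, mod_port_connected=False)
print(len(connected), len(disconnected))
print(connected[0], connected[10])
```

    With the port connected, the entry at index 10 is the first one produced by the second training run, which matches the "11 to 20" rule given above.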
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

Answers

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @Thiru

    Is this what you are looking for? The image shows the inside of the AdaBoost operator, where we calculate the training performance and store it for each iteration using the "Store" operator. The naming convention used for the Store operator is "Adaboost_%{execution_count}". The %{execution_count} macro makes it possible to store the performance of each iteration under its own name. I am not sure whether we can extract the AdaBoost weights.
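
    For reference, here is a minimal sketch of what "performance per iteration" and "weights" mean in textbook discrete AdaBoost. It is plain Python, not RapidMiner, and it does not claim to reproduce what the RapidMiner operator does internally. The decision-stump base learner, the toy data and all function names are assumptions made for illustration, and it runs 3 rounds rather than the 10 used in this thread.

```python
import math


def stump_predict(x, threshold, polarity):
    # A decision stump: one label below the threshold, the opposite above.
    return polarity if x < threshold else -polarity


def fit_stump(xs, ys, weights):
    # Choose the threshold/polarity pair with the lowest weighted error.
    best = None
    for threshold in sorted(set(xs)) + [max(xs) + 1]:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def ensemble_predict(ensemble, x):
    # Weighted vote of all stumps trained so far.
    score = sum(a * stump_predict(x, th, p) for a, th, p in ensemble)
    return 1 if score >= 0 else -1


def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n          # sample weights, uniform at the start
    ensemble, history = [], []
    for t in range(1, rounds + 1):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this round's model
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        correct = sum(1 for x, y in zip(xs, ys)
                      if ensemble_predict(ensemble, x) == y)
        history.append({"round": t, "model_weight": alpha,
                        "train_accuracy": correct / n,
                        "sample_weights": list(weights)})
    return ensemble, history


# Hypothetical toy data: labels that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble, history = adaboost(xs, ys, rounds=3)
for rec in history:
    print(rec["round"], round(rec["model_weight"], 3), rec["train_accuracy"])
```

    Each record in `history` holds the training accuracy of the ensemble after that round, the weight given to that round's model, and the updated sample weights.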



    Do let us know if this helps
    Regards,
    Varun

  • Thiru Member Posts: 100 Guru
    Hello @varunm1,

    Thanks for your reply. Could you please elaborate on how to use "Store" + macros to get the performance
    during each iteration? I'm relatively new to RapidMiner. In the process, I've tried the Set Macro and Generate Macro operators, but
    that does not help. I await your reply. Thank you.
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @Thiru

    You don't need to generate a macro. There are predefined macros; in this case I used the %{execution_count} macro in the name given to the Store operator. The reason is that AdaBoost iterates 10 times, which means you can get 10 training performances. As you need all 10 performances, you need to save them under a dynamic name that updates after every iteration. To do this, I used "Adaboost_%{execution_count}" as the name for storing my performance. %{execution_count} counts the number of times a particular operator has executed. As the Store operator is located inside AdaBoost, it executes 10 times and names the performances Adaboost_1, Adaboost_2, Adaboost_3, ...
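
    The naming mechanism can be imitated with a small sketch. This is plain Python, not RapidMiner: `ToyStore` is a hypothetical stand-in for the Store operator, its dictionary stands in for the repository, and the stored values are placeholders rather than real performance vectors.

```python
class ToyStore:
    """Hypothetical stand-in for a Store operator with a name template."""

    def __init__(self, name_template):
        self.name_template = name_template
        self.execution_count = 0   # how often this operator has executed
        self.repository = {}       # stored name -> stored result

    def execute(self, result):
        # Each execution increments the counter, then expands the macro.
        self.execution_count += 1
        name = self.name_template.replace(
            "%{execution_count}", str(self.execution_count))
        self.repository[name] = result
        return name


store = ToyStore("Adaboost_%{execution_count}")
for iteration in range(1, 11):
    # Placeholder result; a real process would store a performance vector.
    store.execute({"iteration": iteration})

print(list(store.repository))
```

    Because the counter belongs to the operator itself, every execution gets a new name and earlier results are not overwritten.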

    Please find the attached .rmp file. Import it into RapidMiner and look inside the AdaBoost operator.

    Regards,
    Varun

  • Thiru Member Posts: 100 Guru
    Hello @varunm1, thanks for your reply. Where can I view the performance of all 10 models? Do you mean the output of the validation operator? We do get them for AdaBoost + decision tree, which is the case you used. With AdaBoost + KNN, I couldn't view all 10 models. Could you please look into this? Thanks.

    Regards,
    Thiru
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited April 2020
    Hello @Thiru

    You can't view them directly; you need to store them first using the Store operator. That is what I did in the attached process. You need to change the store location, as the earlier one is linked to my repository. You also need to name the results in the Store operator with the macro, as described in my earlier post. Once that is done and you run the process, the Store operator will save the results as Adaboost_1, Adaboost_2, ... in the repository location that you specified in the Store operator.

    In short: attach the Store operator as I did, point it to a repository location, give the name as Adaboost_%{execution_count}, then run the process and check that repository location. You will find the results there.
    Regards,
    Varun

  • Thiru Member Posts: 100 Guru
    Hello @varunm1

    Thanks for your reply. I had only checked the file you sent.
    OK, I got it. I retrieved the stored results in a new process and viewed them.

    1. Adaboost_1 shows 89.04% accuracy and Adaboost_10 shows 99.13%.
    But the overall model performance is only 67.74%.

    Is it because Adaboost_1 to Adaboost_10 are measured on the training data and not the test data, and 67.74% is from the test data?

    2. The file you sent shows results from Adaboost_1 to Adaboost_20, whereas the number of iterations in the AdaBoost operator
    is set to 10. How do we get 20?

    I await your reply on the above. Thanks.

    Regards,
    Thiru


  • Thiru Member Posts: 100 Guru
    Thanks. That clarifies it.

    Regards,
    Thiru
  • Orli Member Posts: 1 Newbie
    edited March 2022
    Hi @varunm1 @Thiru
    I have the same question as Thiru's second one: if the number of iterations in the AdaBoost operator is set to 10, how do we get results for 20 models?
    Could you please tell me the reason?
    Thanks!

    Regards,
    Orli