
Need help... can't predict the other class

yogafire Member Posts: 43 Contributor II
edited November 2018 in Help
Hello, I just moved this topic here because someone advised me that this forum is more suitable. ;D

I'm having some problems with my dataset.

My dataset has a binominal label (yes and no).

I have tried every kind of tree learner, every type of validation, and attribute selection, but I always get this result:

[image]

This means that the model can't predict the other class (yes). Records labeled "yes" make up only about 20% of the entire dataset (about 5200 out of 27000). So the accuracy looks good (about 80%), but it would be "harakiri" if I applied this model. :'(
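To make the numbers above concrete, here is a minimal Python sketch of what happens when a model always answers "no". The counts are the approximate ones from this post; the variable names are only illustrative and nothing here comes from RapidMiner itself.

```python
# Approximate counts from the post: about 5200 "yes" records out of about 27000.
n_total = 27000
n_yes = 5200
n_no = n_total - n_yes

# A model that always answers "no" gets every "no" right and every "yes" wrong.
true_positives = 0
true_negatives = n_no
false_negatives = n_yes

# Accuracy counts all correct predictions, so the majority class dominates it.
accuracy = (true_positives + true_negatives) / n_total

# Recall on "yes" only looks at the records that really are "yes".
recall_yes = true_positives / (true_positives + false_negatives)

print(f"accuracy: {accuracy:.3f}")
print(f"recall on yes: {recall_yes:.3f}")
```

The accuracy comes out at roughly 80% even though not a single "yes" record is found, which is why accuracy alone is misleading on a dataset this unbalanced.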

What should I do? I desperately need help... :-[

Thank you very much for your reply...

Regards,

Dimas Yogatama

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Dimas,
    have you already tried to reweight your examples? You could assign a higher weight to the "yes" examples. This is possible by generating a new attribute and assigning it the role "weight". You can do this either with the Generate Weight (Stratification) operator, if you want to give both classes equal weight, or with a process of your own. Here's a small example process for this:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="235" width="614">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="target_function" value="random classification"/>
          </operator>
          <operator activated="true" class="generate_attributes" expanded="true" height="76" name="Generate Attributes" width="90" x="179" y="30">
            <list key="function_descriptions">
              <parameter key="weight" value="if(label==&quot;positive&quot;,1.5,1)"/>
            </list>
          </operator>
          <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role" width="90" x="313" y="30">
            <parameter key="name" value="weight"/>
            <parameter key="target_role" value="weight"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Greetings,
      Sebastian
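As a rough illustration of the reweighting idea in the reply above, the following Python sketch computes per-example weights that give both classes the same total weight. The function name `balancing_weights` and the formula (total / (number of classes × class count)) are assumptions chosen for the illustration; this is not a description of how Generate Weight (Stratification) works internally. The counts are the approximate ones from the question.

```python
def balancing_weights(class_counts):
    """Return one weight per class so that every class has the same total weight.

    Hypothetical helper for illustration: weight = total / (n_classes * count).
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count)
            for label, count in class_counts.items()}


# Approximate counts from the question: 5200 "yes" and 21800 "no" records.
counts = {"yes": 5200, "no": 21800}
weights = balancing_weights(counts)

# Total weight carried by each class after reweighting.
class_totals = {label: weights[label] * counts[label] for label in counts}

for label in counts:
    print(label, round(weights[label], 3), round(class_totals[label], 1))
```

With these counts each "yes" example ends up weighted about four times as heavily as each "no" example, so a learner that respects example weights can no longer ignore the minority class for free. A fixed factor such as the 1.5 in the example process is a milder version of the same idea.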
  • yogafire Member Posts: 43 Contributor II
    thank you

    I'll try it first