Probability of prediction

MASRAWY2009 Member Posts: 2 Contributor I
Hello,
I am trying to use RapidMiner to predict student enrollment at the college. I have a categorical variable (PAID) that takes the values 0 or 1. I rely mainly on the predicted probability for each case. However, I noticed that RapidMiner only reports the predicted value for each case, without revealing whether the model uses a cutoff value and, if so, what that value is. Please advise whether RapidMiner can generate the probability that each student will enroll.
Thanks.
Amr

Amr Mohamed
Research Analyst
Enrollment and Communication Dept
Ithaca College

Answers

  • earmijo Member Posts: 270 Unicorn
    What algorithm are you using to classify the observations?

    (Quick answer: Yes, you can control the threshold value above which observations are classified as 1s.)
  • MASRAWY2009 Member Posts: 2 Contributor I
    I am using Neural Networks. Is there any way to compute the prediction probability for each case? Also, can I adjust the classification cutoff value for the training data set?
    Thanks,
    Amr
  • earmijo Member Posts: 270 Unicorn
    OK. You have to use two operators: Create Threshold and Apply Threshold. The inputs for Create Threshold are the threshold value and a definition of which class is the first and which is the second.

    Assuming you are doing x-validation, the process would look like this (with a different dataset, obviously).

    Check out this process:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
        <process expanded="true" height="296" width="480">
          <operator activated="true" class="retrieve" compatibility="5.2.003" expanded="true" height="60" name="Retrieve" width="90" x="112" y="75">
            <parameter key="repository_entry" value="//Clases/datos/bigeast"/>
          </operator>
          <operator activated="true" class="x_validation" compatibility="5.0.000" expanded="true" height="112" name="Validation" width="90" x="313" y="75">
            <description>A cross-validation evaluating a neural net model.</description>
            <process expanded="true" height="516" width="232">
              <operator activated="true" class="neural_net" compatibility="5.2.003" expanded="true" height="76" name="Neural Net" width="90" x="112" y="75">
                <list key="hidden_layers"/>
              </operator>
              <connect from_port="training" to_op="Neural Net" to_port="training set"/>
              <connect from_op="Neural Net" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="516" width="433">
              <operator activated="true" class="apply_model" compatibility="5.0.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="create_threshold" compatibility="5.2.003" expanded="true" height="60" name="Create Threshold" width="90" x="45" y="210">
                <parameter key="first_class" value="Yes"/>
                <parameter key="second_class" value="No"/>
              </operator>
              <operator activated="true" class="apply_threshold" compatibility="5.2.003" expanded="true" height="76" name="Apply Threshold" width="90" x="179" y="210"/>
              <operator activated="true" class="performance" compatibility="5.0.000" expanded="true" height="76" name="Performance" width="90" x="313" y="30"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Apply Threshold" to_port="example set"/>
              <connect from_op="Create Threshold" from_port="output" to_op="Apply Threshold" to_port="threshold"/>
              <connect from_op="Apply Threshold" from_port="example set" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
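
    One point the process above leaves implicit: Create Threshold has no threshold value set, so the operator's default applies (0.5, as far as I recall). Below is a minimal sketch of the same operator with the cutoff written out, adapted to a 0/1 label such as PAID. The `threshold` key and the value 0.3 are assumptions for illustration (check the name against the operator's parameter panel), and the sketch assumes the second class is the one predicted when its confidence exceeds the threshold. The probabilities themselves should show up as extra columns (`confidence(0)`, `confidence(1)`) in the example set that Apply Model produces.

    ```xml
    <!-- Sketch only: a variant of the Create Threshold operator from the process above. -->
    <operator activated="true" class="create_threshold" compatibility="5.2.003" expanded="true" height="60" name="Create Threshold" width="90" x="45" y="210">
      <!-- Assumed parameter key; 0.3 is an arbitrary illustrative cutoff. -->
      <parameter key="threshold" value="0.3"/>
      <!-- Label values of the PAID attribute; which one counts as "second" is an assumption. -->
      <parameter key="first_class" value="0"/>
      <parameter key="second_class" value="1"/>
    </operator>
    ```

    Under these assumptions, lowering the value classifies more cases as 1 and raising it classifies fewer, so it is worth comparing the Performance output across a few cutoffs.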
  • casterio Member Posts: 1 Contributor I
    You have all discussed the topic very well. I am glad to have found this forum; it is really very helpful. Looking forward to more information. Thanks