
"Problems with Tutorial Step 6 of 26"

Mark_Knecht Member Posts: 10 Contributor II
edited June 2019 in Help
I'm trying to go through the tutorial this morning and am having some trouble with the instructions in Step 6 of 26. Here is the instruction I'm having trouble with:

Try the following:
Select the Input operator. The property table on the right side shows the parameters of this operator. Press the "Edit" button of the "attributes" parameter. The attribute editor shows a sample of the data. Please note the question marks which represents unknown data. Close the attribute editor.

What 'input operator' are they talking about here and how do I get to the Attribute Editor?

If I click on the Preprocessing block then I see 'attribute filter type', but I don't think this is what I'm being asked to do.

I think I might be in better shape if I could find a help file for RM5, but am I correct that there isn't one?

Thanks,
Mark


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Root">
    <description>&lt;p&gt; Usually much time of data mining is spent to preprocess the data. RapidMiner offers several operators to read your data from many different sources and also operators to process your data and ease learning. &lt;/p&gt; &lt;p&gt; In many applications the data contains missing values. One of the available preprocessing operators replaces them with the average / min / max of the attribute. Other operators can also handle infinite values. &lt;/p&gt; &lt;p&gt; Try the following: &lt;ul&gt; &lt;li&gt;Select the Input operator. The property table on the right side shows the parameters of this operator. Press the &amp;quot;Edit&amp;quot; button of the &amp;quot;attributes&amp;quot; parameter. The attribute editor shows a sample of the data. Please note the question marks which represents unknown data. Close the attribute editor. By the way, the attribute editor can also be used to create attribute description files (.aml) for data sets.&lt;/li&gt; &lt;li&gt;Use a breakpoint after the Input operator and run the process. Compare the data before and after the preprocessing.&lt;/li&gt; &lt;li&gt;The Output operator writes the data back into a file. You can look into this file with an arbitrary text editor. Please refer to the RapidMiner Tutorial for further information about using the ExampleSetWriter.&lt;/li&gt; &lt;/ul&gt; &lt;/p&gt;</description>
    <process expanded="true" height="584" width="300">
      <operator activated="true" class="retrieve" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="../../data/Labor-Negotiations"/>
      </operator>
      <operator activated="true" class="replace_missing_values" expanded="true" height="94" name="Preprocessing" width="90" x="180" y="30">
        <parameter key="attribute" value="shift-differential"/>
        <parameter key="value_type" value="numeric"/>
        <list key="columns">
          <parameter key="wage-inc-1st" value="minimum"/>
          <parameter key="wage-inc-3rd" value="maximum"/>
        </list>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Preprocessing" to_port="example set input"/>
      <connect from_op="Preprocessing" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>


Answers

  • Mark_Knecht Member Posts: 10 Contributor II
    If someone could confirm, maybe this means:

    1) Select Preprocessing
    2) Put RM in Expert Mode
    3) Look for the expert parameter 'columns'
    4) Select 'Edit List (2)'

    ??????

    Note that the word 'attributes' appears nowhere on the right-hand side.

    I still think I've got this wrong.

    - Mark
  • SebastianLoh Member Posts: 99 Maven
    Hi Mark,

    Indeed, the tutorial is confusing and outdated. Below you will find the updated description for tutorial step 6. A German online manual will be published very soon and the English translation is in progress. This forum should keep you up to date on when it will be published.


    Ciao Sebastian


    Now the description:


    Usually, much of the time in data mining is spent on preprocessing the data. RapidMiner offers several operators to read your data from many different sources, as well as operators to process your data and ease learning.
    In many applications the data contains missing values. One of the available preprocessing operators replaces them with the average / min / max of the attribute. Other operators can also handle infinite values.

    Try the following:

    Select the "Retrieve" operator. The "Parameters" tab on the right side shows the parameters of this operator. The "Retrieve" operator only has the "repository entry" parameter. If you press F7 or right-click the "Retrieve" operator, you can set a breakpoint, i.e. the process execution will stop after this operator. If you now run the process by pressing the "Play" (F11) button, the process starts and pauses at the breakpoint after the "Retrieve" operator. RapidMiner then displays the output of the "Retrieve" operator in the "ExampleSet (Retrieve)" tab. The column "Missings" indicates the number of missing values for an attribute, e.g. the pension attribute has 22 missing values. Switch from the "Meta Data View" to the "Data View" to take a look at the actual missing values. In the data table you can find some question marks, each of which indicates a missing value for one sample (row). The "View Filter" in the upper right corner of the tab allows you to filter your data set by certain criteria. Try out some filters to see which examples are complete and which have missing values.
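
For readers who find it easier to think in code, here is a rough sketch of what the "Missings" column and the complete/incomplete view filter compute. It is plain Python, not RapidMiner code; the sample rows and the helper name `count_missing` are made up for illustration, and `None` stands in for the question marks shown in the "Data View".

```python
# Hypothetical sample rows; None plays the role of "?" in the Data View.
rows = [
    {"wage-inc-1st": 4.5, "pension": None},
    {"wage-inc-1st": None, "pension": "none"},
    {"wage-inc-1st": 2.0, "pension": "empl_contr"},
]


def count_missing(rows):
    """Return the number of missing (None) values per attribute."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (1 if value is None else 0)
    return counts


# Per-attribute counts, similar in spirit to the "Missings" column.
missing = count_missing(rows)

# Rows without any missing value, similar to filtering for complete examples.
complete = [row for row in rows if all(v is not None for v in row.values())]

print(missing)
print(complete)
```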

    Now switch back to the Design perspective (Menu bar: View/Perspectives/Design). In order to replace missing values in the data we use the "Preprocessing (Replace Missing Values)" operator. You can enable the Expert Mode by pressing F4. When you select the preprocessing operator, you can set up its parameters in the "Parameters" tab on the right side. The parameter "attribute filter type" determines which attributes the operator is applied to. The parameter "default" determines the value that replaces a missing value. You can select various options, e.g. the average value of the attribute. You can chain several preprocessing operators in order to replace missing values in different attributes with different types of default values.
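
To make the idea of a replacement strategy concrete, here is a minimal sketch in plain Python. It is not how RapidMiner implements the operator; the function name `replace_missing` and the sample numbers are invented for illustration, and the strategy names simply mirror the average / minimum / maximum options mentioned above.

```python
def replace_missing(values, strategy="average"):
    """Replace None entries in a numeric column using the given strategy."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to compute a fill value from; return the column unchanged.
        return list(values)
    if strategy == "average":
        fill = sum(present) / len(present)
    elif strategy == "minimum":
        fill = min(present)
    elif strategy == "maximum":
        fill = max(present)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing values.
column = [2.0, None, 4.0, None, 6.0]

by_average = replace_missing(column)             # default strategy
by_minimum = replace_missing(column, "minimum")  # per-attribute override
by_maximum = replace_missing(column, "maximum")
```

In the process posted above, the "columns" list plays the role of the per-attribute override: wage-inc-1st is set to minimum and wage-inc-3rd to maximum.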
  • Mark_Knecht Member Posts: 10 Contributor II
    Thanks Sebastian. It's good to know documentation is coming. I'm sure a lot of my questions will be answered there.

    Cheers,
    Mark