Decision tree/ What format of excel can be used in RapidMiner?

ranya2670 Member Posts: 4 Contributor I
edited December 2018 in Help

Hi

I am a new user and I am trying to build a decision tree. However, it seems that the process can't retrieve my dataset, and when I run the process no decision tree is produced. What am I doing wrong?

Best,

Ranya

Best Answer

  • ranya2670 Member Posts: 4 Contributor I
    Solution Accepted

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.0.000" expanded="true" height="68" name="Retrieve Soccer" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Local Repository/data/Soccer"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
    <parameter key="attribute_name" value="Full Time Result"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.0.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="AwayTeam (AT)|Full Time AT Goals|Full Time HT Goals|Half-Time AT Goals|Half-Time HT Goals|Half-Time Result|HomeTeam (HT)|Full Time Result|AT Corners|HT Corners|HT Shots on Target|AT Shots on Target"/>
    </operator>
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.000" expanded="true" height="103" name="Decision Tree" width="90" x="447" y="34"/>
    <connect from_op="Retrieve Soccer" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <background height="232" location="//Samples/Tutorials/Basics/08/tutorial8" width="1502" x="26" y="47"/>
    </process>
    </operator>
    </process>

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello @ranya2670, welcome to the community. We'd be happy to help, but could you please post your "Soccer" data set and your XML process here in the thread so we can see what is going on? Screenshots do not help very much. Instructions are on the right under "Read Before Posting" when you reply.

     

    Screen Shot 2017-12-05 at 9.59.29 AM.png

     

    Scott

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Good morning Selina,

     

    This should do the trick.  :)

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_csv" compatibility="8.0.001" expanded="true" height="68" name="Read CSV" width="90" x="45" y="34">
    <parameter key="csv_file" value="/Users/genzerconsulting/Desktop/Data Set Soccer (PL).csv"/>
    <parameter key="date_format" value="dd/MM/yyyy"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <parameter key="encoding" value="UTF-8"/>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Date.true.date.attribute"/>
    <parameter key="1" value="HomeTeam HT.true.polynominal.attribute"/>
    <parameter key="2" value="AwayTeam AT.true.polynominal.attribute"/>
    <parameter key="3" value="Full Time HT Goals.true.integer.attribute"/>
    <parameter key="4" value="Full Time AT Goals.true.integer.attribute"/>
    <parameter key="5" value="Full Time Result.true.polynominal.attribute"/>
    <parameter key="6" value="Half-Time HT Goals.true.integer.attribute"/>
    <parameter key="7" value="Half-Time AT Goals.true.integer.attribute"/>
    <parameter key="8" value="Half-Time Result.true.polynominal.attribute"/>
    <parameter key="9" value="Referee.true.polynominal.attribute"/>
    <parameter key="10" value="HT Shots.true.integer.attribute"/>
    <parameter key="11" value="AT Shots.true.integer.attribute"/>
    <parameter key="12" value="HT Shots on Target.true.integer.attribute"/>
    <parameter key="13" value="AT Shots on Target.true.integer.attribute"/>
    <parameter key="14" value="HT Fouls committed.true.integer.attribute"/>
    <parameter key="15" value="AT Fouls committed.true.integer.attribute"/>
    <parameter key="16" value="HT Corners.true.integer.attribute"/>
    <parameter key="17" value="AT Corners.true.integer.attribute"/>
    <parameter key="18" value="HT Yellow Cards.true.integer.attribute"/>
    <parameter key="19" value="AT Yellow Cards.true.integer.attribute"/>
    <parameter key="20" value="HT Red Cards.true.integer.attribute"/>
    <parameter key="21" value="AT Red Cards.true.integer.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
    <parameter key="attribute_name" value="Full Time Result"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.0.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="AwayTeam (AT)|Full Time AT Goals|Full Time HT Goals|Half-Time AT Goals|Half-Time HT Goals|Half-Time Result|HomeTeam (HT)|Full Time Result|AT Corners|HT Corners|HT Shots on Target|AT Shots on Target"/>
    </operator>
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="103" name="Decision Tree" width="90" x="447" y="34"/>
    <connect from_op="Read CSV" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <background height="232" location="//Samples/Tutorials/Basics/08/tutorial8" width="1502" x="26" y="47"/>
    </process>
    </operator>
    </process>

     

    selina-tree.png

     

    Scott

     

     

  • ranya2670 Member Posts: 4 Contributor I

    Hi 

    I am still not able to get a decision tree on my computer with the process you attached. It says 'file not found'. What is the problem?

  • ranya2670 Member Posts: 4 Contributor I

    I just cleansed my dataset so I will upload the new one and the code again. Can you then tell me how I can run the process on my computer without error?

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.0.000" expanded="true" height="68" name="Retrieve Data Set Soccer (PL) cleansed" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Local Repository/Data Set Soccer (PL) cleansed"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
    <parameter key="attribute_name" value="Full Time Result"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.0.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="34"/>
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.000" expanded="true" height="103" name="Decision Tree" width="90" x="447" y="34"/>
    <connect from_op="Retrieve Data Set Soccer (PL) cleansed" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <background height="232" location="//Samples/Tutorials/Basics/08/tutorial8" width="1502" x="26" y="47"/>
    </process>
    </operator>
    </process>

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I'm not sure what error you are talking about, but it works fine for me.

    DT Works.png

     

     

     

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello. Yes, I changed your process slightly so that it reads from the CSV file instead of from your repository. The 'file not found' error appears because the file path on my computer is different from the path on yours. You just need to change the file path in the Read CSV operator so that it points to the CSV file on your computer.
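
    As a rough sketch of that change, only the csv_file value of the Read CSV operator needs editing. The path shown here is a hypothetical placeholder, not a real location, and the fragment omits the annotations and meta data lists, which would stay exactly as in the process posted earlier in this thread:

    ```xml
    <!-- Fragment only: replace the csv_file value with the location of the file on your own machine. -->
    <!-- "C:/Users/your-name/Desktop/..." is a made-up example path, assumed for illustration. -->
    <operator activated="true" class="read_csv" compatibility="8.0.001" expanded="true" height="68" name="Read CSV" width="90" x="45" y="34">
      <parameter key="csv_file" value="C:/Users/your-name/Desktop/Data Set Soccer (PL).csv"/>
      <parameter key="date_format" value="dd/MM/yyyy"/>
      <parameter key="first_row_as_names" value="false"/>
      <!-- the "annotations" and "data_set_meta_data_information" lists are unchanged and omitted here -->
    </operator>
    ```

    The same value can also be set in the Parameters panel by selecting the Read CSV operator and editing its csv file field, which avoids touching the XML by hand.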

     

    Skærmbillede 2017-12-07 kl. 16.24.36.png
