Use dates from Data A to indicate events in Data B

SHSguy Member Posts: 24 Contributor I
edited November 2018 in Help

Hi, 

 

Apologies if this has already been discussed; I am sure it has, but I could not find it :) If so, please leave a link to the discussion.

I would like to use the dates in Data A (an Excel sheet) to highlight the corresponding dates in Data B (another Excel sheet) in order to indicate significant events, for use in a time series chart.

 

I have tried the operators Append, Join, and Union numerous times with no luck. 

Each data set runs successfully on its own and produces its own table.

 

Thank you for any assistance on the matter. 

Screen Shot 2018-01-18 at 4.06.07 pm.png

 

 

 


Best Answer

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @SHSguy,

    I thought that you needed Python for another task, but you don't need Python, Anaconda, or Execute Script to run the process that I shared (you will need to adapt it to your dataset). You need to:

     

    1. Activate the XML panel:

    Date_A_B.png

     

    2. Copy the code that I shared and paste it into the XML panel:

    Date_A_B_2.png

    3. Click on the "Check" button:

    Date_A_B_3.png

    4. The process should then appear in the main window.

     

    I hope this helps.

     

    Regards, 

     

    Lionel

     

     

     

     


Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @SHSguy,

     

    What do you mean by "highlight"? Can you post an example of what you want to do? Can you also share your dataset(s), please?

     

    Regards, 

     

    Lionel

  • SHSguy Member Posts: 24 Contributor I

    Hi Lionel, 

     

    Thank you for the reply. I don't have an example but maybe I can explain it better: 

    Data A (Events) states the dates of interest: 

    2014-01-10

    2015-03-11

    2016-05-28

    Data B (Stock prices) provides the stock prices from 2013 to 2017. 

    I am trying to use the dates in Data A (red stars) to highlight/indicate the corresponding dates in Data B. 
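
    To make the goal concrete, here is a minimal sketch in plain Python (not RapidMiner) of one way to read "highlight": every stock price row is kept and gets an extra flag that is true when its date appears in Data A. The sample prices and the names `event_dates`, `stock_prices`, and `flag_events` are made up for illustration; the real values would come from the two Excel sheets.

```python
from datetime import date

# Hypothetical sample data; the real values would come from the two Excel sheets.
event_dates = {date(2014, 1, 10), date(2015, 3, 11), date(2016, 5, 28)}
stock_prices = [
    (date(2014, 1, 9), 101),
    (date(2014, 1, 10), 103),
    (date(2014, 1, 13), 102),
    (date(2015, 3, 11), 97),
]

def flag_events(prices, events):
    """Return (date, price, is_event) for every price row, keeping all rows."""
    return [(d, p, d in events) for d, p in prices]

flagged = flag_events(stock_prices, event_dates)
for d, price, is_event in flagged:
    # Mark event dates with a star, similar to the red stars in the chart.
    marker = "*" if is_event else ""
    print(d.isoformat(), price, marker)
```

    Keeping all rows matters for a time series chart: the full price curve is still drawn, and the flag only decides where a marker goes.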

    Screen Shot 2018-01-18 at 5.25.08 pm.png

     

     

     

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi again @SHSguy,

     

    I think I understand.

    Does this process meet your needs?

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel" width="90" x="112" y="34">
    <parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Date.xlsx"/>
    <parameter key="imported_cell_range" value="A1:A4"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="date.true.date_time.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role" width="90" x="246" y="34">
    <parameter key="attribute_name" value="date"/>
    <parameter key="target_role" value="id"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel (2)" width="90" x="112" y="238">
    <parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\stock_prices.xlsx"/>
    <parameter key="imported_cell_range" value="A1:B1827"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="date.true.date_time.attribute"/>
    <parameter key="1" value="stock prices.true.integer.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role (2)" width="90" x="313" y="187">
    <parameter key="attribute_name" value="date"/>
    <parameter key="target_role" value="id"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="join" compatibility="8.0.001" expanded="true" height="82" name="Join" width="90" x="447" y="85">
    <parameter key="remove_double_attributes" value="false"/>
    <list key="key_attributes"/>
    </operator>
    <connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Join" to_port="left"/>
    <connect from_op="Read Excel (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Set Role (2)" from_port="example set output" to_op="Join" to_port="right"/>
    <connect from_op="Join" from_port="join" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
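
    For readers who do not have RapidMiner at hand, the idea behind the process above can be sketched in plain Python: both tables use the date as their key (the Set Role operators mark it as the id), and the Join keeps the rows whose date is found on both sides. This assumes the Join operator is running as an inner join, since the process does not set a join type. The sample prices and the function name `inner_join_on_date` are hypothetical.

```python
from datetime import date

# Hypothetical stand-ins for the two Excel sheets (values are made up).
events = [date(2014, 1, 10), date(2015, 3, 11), date(2016, 5, 28)]
stock_prices = {
    date(2014, 1, 9): 101,
    date(2014, 1, 10): 103,
    date(2015, 3, 11): 97,
    date(2016, 5, 28): 110,
    date(2016, 5, 29): 108,
}

def inner_join_on_date(event_dates, prices):
    """Keep only the dates present in both inputs, paired with their price."""
    return [(d, prices[d]) for d in event_dates if d in prices]

joined = inner_join_on_date(events, stock_prices)
for d, price in joined:
    print(d.isoformat(), price)
```

    Under that assumption, the result contains one row per event date that also has a stock price, which gives the points to mark on the chart.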

    Here is the link to download the fictitious example sets used in the process above:

    https://drive.google.com/open?id=1gZ3gtvU_E760tjWUpKD75ElLHW06rNxO

     

    Don't hesitate to reply if this is not what you are looking for.

     

    Regards, 

     

    Lionel

  • SHSguy Member Posts: 24 Contributor I

    Hi Lionel, 

    Thank you for the information, it looks very impressive. Unfortunately, I have no idea how to use the code in RapidMiner. I downloaded the Python3 extension for RapidMiner and installed Pandas for Python on the Mac (I tried searching for it in the extensions, but it does not seem to be there).

     

    Cheers,

     

     

    Screen Shot 2018-01-19 at 3.26.14 pm.png

    Screen Shot 2018-01-19 at 3.27.08 pm.png

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @SHSguy,

     

    Pandas is a Python library that you have to install yourself on your Mac.

    However, to install Python 3 and all the associated libraries, I recommend using Anaconda:

     

    https://www.anaconda.com/download/#macos

     

    Could you adapt and run the code to get what you want for the stock prices?

    Tip: you can add a Write Excel operator at the end of the process to save your example set as an Excel file (in order to create your charts, etc.).

     

    Regards, 

     

    Lionel

  • SHSguy Member Posts: 24 Contributor I

    Hi Lionel, 

     

    I installed both Anaconda and Pandas and checked that both are running. Unfortunately, I am getting a script error. If possible, could you attach a screenshot of the process setup in RapidMiner?

     

    Screen Shot 2018-01-20 at 6.02.55 pm.png

     

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    The Execute Script operator is only for use with Groovy script code. Do what @lionelderkrikor says above to load in the XML code. 

  • SHSguy Member Posts: 24 Contributor I

    Thank you, that helped a lot. I appreciate the effort. 

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    You're welcome @SHSguy

     

    Don't forget to add the Write Excel operator at the end of your process to save your resulting ExampleSet as an Excel file.

    You will then be able to build the curves you showed in a previous post.
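
    As a rough illustration of the "save the result for charting" step outside RapidMiner, the sketch below serialises joined rows to CSV text with Python's standard `csv` module. CSV is swapped in here for the Excel format only to keep the example dependency-free; the rows and the function name `rows_to_csv` are hypothetical.

```python
import csv
import io

# Hypothetical joined rows (date, stock price); values are made up.
rows = [("2014-01-10", 103), ("2015-03-11", 97), ("2016-05-28", 110)]

def rows_to_csv(rows):
    """Serialise (date, price) rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "stock prices"])
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(rows)
print(csv_text)
```

    The resulting text can be written to a file and opened in a spreadsheet program to draw the chart.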

     

    Regards, 

     

    Lionel

     
