Script task issue with RapidMiner Studio 9.2

olli_aroolli_aro Member Posts: 4 Newbie
edited April 2020 in Help
Hi all,

Has anyone else noticed any issues with using Script Tasks with the latest version of RapidMiner Studio (RapidMiner Studio 9.2.000 (rev:461351, platform: WIN64))?

If I add a script task to my process, all follow-on tasks seem to lose the ExampleSet, so that e.g. the available attributes list in the "Reorder Attributes" task is empty. The funny thing is that the follow-on tasks still seem to execute fine when I run the process. For example, if I remove the script task from the process, reorder the attributes with "Reorder Attributes", then put the script task back in and run, the output is reordered as per my configuration.

The above used to work for me with no issue in the previous version of RapidMiner Studio.

Regards,

Olli



Answers

  • hughesfleming68hughesfleming68 Member Posts: 323 Unicorn
    edited February 2019
    I had a similar issue with Select Attributes yesterday, and this does seem to be new in 9.2. Some of my attributes were missing from the list. I had to select the ones I didn't want and then invert the selection. Even though I could not see the attributes in the operator, the process worked with the new attributes. I will try to reproduce this later today.
  • olli_aroolli_aro Member Posts: 4 Newbie
    edited March 2019
    Thanks for the post back. Maybe a bug then?
  • Marco_BoeckMarco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    edited March 2019
    Hi,

    When you edit your process in the UI and select attributes in the parameters of an operator, the list you choose from comes from what we call "metadata". It's a best-effort aid to help you create a process. When the process actually runs on the real data, this metadata is irrelevant and only the actual data is looked at.

    We changed the way this metadata is generated in Studio 9.2, because previously it could freeze your entire Studio UI if you had an operator with large or slow metadata. That was fixed, but as a result some of the metadata may now take a while to appear, or may be missing outright because we overlooked something. Please let us know about these instances and, if possible, share the process with us so we can have a look!

    Regards,
    Marco
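The distinction Marco describes between design-time metadata and run-time data can be sketched with a toy model. This is not RapidMiner's implementation or API: the `Step` class, the `predict` hook, the rule that a step without `predict` yields no metadata, and the sample row are all assumptions chosen to mirror the symptom reported in this thread.

```python
from typing import Callable, Dict, List, Optional

Rows = List[Dict[str, str]]


class Step:
    """Hypothetical pipeline step: `run` works on real rows, `predict` on column names."""

    def __init__(self, name: str, run: Callable[[Rows], Rows],
                 predict: Optional[Callable[[List[str]], List[str]]] = None):
        self.name = name
        self.run = run
        self.predict = predict  # None models a step that offers no metadata


def keep(columns: List[str]):
    """Build the run/predict pair for a step that keeps only `columns`."""
    def run(rows: Rows) -> Rows:
        return [{c: row[c] for c in columns} for row in rows]

    def predict(available: List[str]) -> List[str]:
        return [c for c in available if c in columns]

    return run, predict


def design_time_columns(steps: List[Step], source_columns: List[str]):
    """Columns an editor could list at each step's input (None = unknown)."""
    shown = {}
    current: Optional[List[str]] = list(source_columns)
    for step in steps:
        shown[step.name] = current
        if current is None or step.predict is None:
            current = None  # metadata is lost from here on
        else:
            current = step.predict(current)
    return shown


def execute(steps: List[Step], rows: Rows) -> Rows:
    """Run the steps on the actual data; metadata plays no part here."""
    for step in steps:
        rows = step.run(rows)
    return rows


steps = [
    Step("Select Attributes (2)", *keep(["District", "Town"])),
    Step("Execute Script", run=lambda rows: rows),  # pass-through, no predict
    Step("Select Attributes", *keep(["Town"])),
]

shown = design_time_columns(steps, ["Town", "District"])
result = execute(steps, [{"Town": "A", "District": "X"}])  # made-up sample row

print(shown["Select Attributes"])  # None: nothing to list at design time
print(result)                      # [{'Town': 'A'}]: the run still works
```

In this model the second selection step has nothing to show in the editor, yet executing the steps still applies its configuration, which matches the behaviour described in the original post.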
  • olli_aroolli_aro Member Posts: 4 Newbie
    Hi Marco,

    Thanks for the message back.

    The metadata simply goes missing for any data steps that follow the script step.

    It is really easy to demonstrate. Please see the process below. The Select Attributes operator before the script task can see both Town and District, but the second Select Attributes cannot. If I exclude the script task from the process flow, everything works as expected.

    Regards,

    Olli


    <pre class="CodeBlock"><code><?xml version="1.0" encoding="UTF-8"?>
    <process version="9.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="read_excel" compatibility="9.2.000" expanded="true" height="68" name="Read Excel" width="90" x="112" y="34">
            <parameter key="excel_file" value="sample data.xlsx"/>
            <parameter key="sheet_selection" value="sheet number"/>
            <parameter key="sheet_number" value="1"/>
            <parameter key="imported_cell_range" value="A1"/>
            <parameter key="encoding" value="SYSTEM"/>
            <parameter key="first_row_as_names" value="true"/>
            <list key="annotations"/>
            <parameter key="date_format" value=""/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="read_all_values_as_polynominal" value="false"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Town.true.polynominal.attribute"/>
              <parameter key="1" value="District.true.polynominal.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="9.2.000" expanded="true" height="82" name="Select Attributes (2)" width="90" x="246" y="85">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value="District|Town"/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
          </operator>
          <operator activated="true" class="execute_script" compatibility="9.2.000" expanded="true" height="82" name="Execute Script" width="90" x="246" y="238">
            <parameter key="script" value="/* &#10; * You can use both Java and Groovy syntax in this script.&#10; * &#10; * Note that you have access to the following two predefined variables:&#10; * 1) input (an array of all input data)&#10; * 2) operator (the operator instance which is running this script)&#10; */&#10;&#10;// Take first input data and treat it as generic IOObject&#10;// Alternatively, you could treat it as an ExampleSet if it is one:&#10;// ExampleSet inputData = input[0];&#10;IOObject inputData = input[0];&#10;&#10;&#10;// You can add any code here&#10;&#10;&#10;// This line returns the first input as the first output&#10;return inputData;"/>
            <parameter key="standard_imports" value="true"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="9.2.000" expanded="true" height="82" name="Select Attributes" width="90" x="447" y="136">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
          </operator>
          <connect from_port="input 1" to_op="Read Excel" to_port="file"/>
          <connect from_op="Read Excel" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
          <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Execute Script" to_port="input 1"/>
          <connect from_op="Execute Script" from_port="output 1" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process></code></pre>
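For anyone who wants to check a larger process for the same pattern, here is a minimal sketch that lists the operators sitting downstream of an Execute Script operator, i.e. the ones whose attribute lists may appear empty in the editor. It relies only on the `operator` and `connect` elements visible in the process above. The function name `downstream_of_scripts` is made up for illustration, the embedded XML is a trimmed copy of the process with parameters and the XML declaration omitted, and the sketch assumes operator names are unique within the process.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict, deque

# Trimmed copy of the process from this thread: operators and connections only.
PROCESS_XML = """<process version="9.2.000">
  <operator activated="true" class="process" name="Process">
    <process expanded="true">
      <operator class="read_excel" name="Read Excel"/>
      <operator class="select_attributes" name="Select Attributes (2)"/>
      <operator class="execute_script" name="Execute Script"/>
      <operator class="select_attributes" name="Select Attributes"/>
      <connect from_port="input 1" to_op="Read Excel" to_port="file"/>
      <connect from_op="Read Excel" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
      <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Execute Script" to_port="input 1"/>
      <connect from_op="Execute Script" from_port="output 1" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
    </process>
  </operator>
</process>"""


def downstream_of_scripts(xml_text):
    """Return names of operators reachable from any execute_script operator."""
    root = ET.fromstring(xml_text)

    # Starting points: every operator whose class is execute_script.
    scripts = [op.get("name") for op in root.iter("operator")
               if op.get("class") == "execute_script"]

    # Build operator-to-operator edges; connections to process ports
    # (no from_op or no to_op) are skipped.
    edges = defaultdict(list)
    for conn in root.iter("connect"):
        src, dst = conn.get("from_op"), conn.get("to_op")
        if src and dst:
            edges[src].append(dst)

    # Breadth-first walk from the script operators.
    affected = []
    seen = set(scripts)
    queue = deque(scripts)
    while queue:
        current = queue.popleft()
        for nxt in edges[current]:
            if nxt not in seen:
                seen.add(nxt)
                affected.append(nxt)
                queue.append(nxt)
    return affected


print(downstream_of_scripts(PROCESS_XML))  # ['Select Attributes']
```

The walk follows connections transitively, so in a longer chain every operator after the script step is reported, not just the first one.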


  • olli_aroolli_aro Member Posts: 4 Newbie
    Hi Marco. This has fixed the issue for me. Thank you so much for your help on this one. Best regards, Olli