attribute selection operator

mansour_ebrahim Member Posts: 22 Contributor II
Dear All,
I need to create sub-datasets from a main dataset, that is, select subsets of attributes from an original dataset with more than 1000 columns (attributes). I have the names of the attributes but cannot paste them into the Select Attributes subset area. Does anybody have a good suggestion for avoiding selecting them one by one? Doing that is really painful and tedious.
Best wishes and regards.
Mansour

Best Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,508 RM Data Scientist
    Solution Accepted
    Attached is one way of filtering "by a list":

    Best,
    Martin

    <?xml version="1.0" encoding="UTF-8"?><process version="9.3.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.3.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.3.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="246" y="238">
            <parameter key="generator_type" value="comma separated text"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="input_csv_text" value="att&#10;Wind&#10;Temperature"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.3.001" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="238">
            <list key="function_descriptions">
              <parameter key="Weight" value="1"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <operator activated="true" class="converters:example_set_2_weights" compatibility="0.5.000" expanded="true" height="82" name="ExampleSet to Weights" width="90" x="514" y="238">
            <parameter key="name_attribute" value="att"/>
            <parameter key="weights_attribute" value="Weight"/>
          </operator>
          <operator activated="true" class="retrieve" compatibility="9.3.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="246" y="85">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="select_by_weights" compatibility="9.3.001" expanded="true" height="103" name="Select by Weights" width="90" x="648" y="136">
            <parameter key="weight_relation" value="greater equals"/>
            <parameter key="weight" value="1.0"/>
            <parameter key="k" value="10"/>
            <parameter key="p" value="0.5"/>
            <parameter key="deselect_unknown" value="true"/>
            <parameter key="use_absolute_weights" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="ExampleSet to Weights" to_port="example set"/>
          <connect from_op="ExampleSet to Weights" from_port="weights" to_op="Select by Weights" to_port="weights"/>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Select by Weights" to_port="example set input"/>
          <connect from_op="Select by Weights" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>


    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
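
For readers who want to see what each operator in the process above contributes, the same "filter by a list" logic can be sketched in plain Python. This is only an illustration under stated assumptions, not RapidMiner code: the function names `names_to_weights` and `select_by_weights` are hypothetical, and the column list assumes the regular attributes of the Golf sample data. The threshold, the absolute-value comparison, and the handling of unknown attributes mirror the `weight`, `use_absolute_weights`, and `deselect_unknown` parameters shown in the process.

```python
from typing import Dict, List


def names_to_weights(names: List[str], weight: float = 1.0) -> Dict[str, float]:
    """Give every listed attribute name the same weight (the process uses 1)."""
    return {name.strip(): weight for name in names if name.strip()}


def select_by_weights(columns: List[str], weights: Dict[str, float],
                      threshold: float = 1.0,
                      deselect_unknown: bool = True) -> List[str]:
    """Keep columns whose absolute weight is >= threshold.

    Columns without a weight are dropped when deselect_unknown is True
    and kept otherwise.
    """
    selected = []
    for column in columns:
        if column not in weights:
            if not deselect_unknown:
                selected.append(column)
            continue
        if abs(weights[column]) >= threshold:
            selected.append(column)
    return selected


# Assumed regular attributes of the Golf sample data (label omitted).
golf_columns = ["Outlook", "Temperature", "Humidity", "Wind"]
# The pasted list, one name per line, as in the input_csv_text parameter.
pasted = "Wind\nTemperature"

weights = names_to_weights(pasted.splitlines())
print(select_by_weights(golf_columns, weights))  # ['Temperature', 'Wind']
```

Because the names arrive as plain text, a list of 1000 attribute names can be pasted in one go instead of being picked one by one.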

Answers

  • mansour_ebrahim Member Posts: 22 Contributor II
    Hi Martin,
    Many thanks. However, the ExampleSet to Weights operator does not show up in my RapidMiner. Which version are you currently using?
    Regards.
    Mansour