Time: mixed unit (hours & minutes)

Pulpito New Altair Community Member
edited November 2024 in Community Q&A
Hi,
I am cleaning a dataset before training a model on it.
I have a nominal time attribute whose values use mixed units, e.g. 2 hours, 45 minutes, 1.5 hours.
How can I standardize the unit, preferably as a number of minutes (e.g. 1.5 hours becomes simply 90, and 45 minutes becomes just 45)?

Thanks

Answers

  • jwpfau
    Altair Employee
    edited July 2021
    Hi,

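    Not beautiful, but something like this would work. The Generate Attributes expression rewrites each value into an arithmetic expression (e.g. "2 hours, 45 minutes" becomes 2*60+45) and evaluates it to get minutes: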

    <?xml version="1.0" encoding="UTF-8"?><process version="9.9.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.9.002" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.9.002" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="34">
            <parameter key="generator_type" value="comma separated text"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="input_csv_text" value="duration&#10;2 hours, 45 minutes&#10;3 hours&#10;3 minutes&#10;1 hour&#10;1 minute&#10;1 hour, 1 minute&#10;0 hours, 5 minutes"/>
            <parameter key="column_separator" value=";"/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.9.002" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="34">
            <list key="function_descriptions">
              <parameter key="minutes" value="eval(replaceAll(replaceAll(replaceAll(replaceAll(duration, &quot;,&quot;, &quot;+&quot;), &quot;hour[s]?&quot;, &quot;*60&quot;), &quot;minute[s]?&quot;, &quot;&quot;), &quot; &quot;, &quot;&quot;), REAL)"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    

    Greetings,
    Jonas
  • Pulpito
    Pulpito New Altair Community Member
    Thanks Jonas,
    I am impressed with your technical knowledge.

    But I am a beginner and would probably need a simpler solution.
    At this stage of my experience, I still cannot connect any coding with RapidMiner.

    Maybe I should clean the file outside of RapidMiner.
    Thanks a lot for your detailed solution.

    Best
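
For cleaning the column outside of RapidMiner, as considered in the reply above, a minimal Python sketch could look like the following. It assumes the values only use the units "hour(s)" and "minute(s)", as in the examples in this thread, and the function name `to_minutes` is hypothetical. Like the expression in the process above, it returns the duration in minutes.

```python
import re

# Minutes per unit. The unit spellings are an assumption based on the
# examples in this thread ("hour(s)" and "minute(s)" only).
UNIT_MINUTES = {"hour": 60.0, "minute": 1.0}

# A number (integer or decimal) followed by a unit, e.g. "1.5 hours".
PART = re.compile(r"(\d+(?:\.\d+)?)\s*(hour|minute)s?", re.IGNORECASE)


def to_minutes(text):
    """Convert a value such as '1.5 hours' or '2 hours, 45 minutes' to minutes."""
    parts = PART.findall(text)
    if not parts:
        # Fail loudly instead of silently returning 0 for unexpected input.
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * UNIT_MINUTES[unit.lower()] for value, unit in parts)


if __name__ == "__main__":
    for value in ["2 hours, 45 minutes", "1.5 hours", "45 minutes"]:
        print(value, "->", to_minutes(value))
```

Unlike the `eval`-based expression, this sketch never evaluates the text as code; it only sums the number/unit pairs it recognizes and raises an error for a value with no recognizable pair.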
