Handling durations of format "hh:mm:ss" (sum of time attributes)

acea Member Posts: 2 Contributor I
I have a CSV data set with time attributes in "hh:mm:ss" format, e.g. time1 = "01:23:45" and time2 = "01:01:01".
I want to use Generate Attributes (or some other operator or process) to calculate, e.g., SumOfTime1AndTime2 (which should be 02:24:46 in this example).
How can I do that?

I found that I can't simply add attributes of type time in Generate Attributes. I then tried several operator chains (Nominal to Date with type time, Date to Numerical, then Generate Attributes, and back to a date).
But nothing seems to work, because time seems to be handled internally as a date value.
Any ideas?
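
For reference, the arithmetic being asked for treats each value as a duration rather than a point in time. A minimal sketch of that calculation outside RapidMiner, in Python; the helper names parse_duration and format_duration are hypothetical and only illustrate the intended result:

```python
from datetime import timedelta


def parse_duration(text):
    """Turn an 'hh:mm:ss' string into a timedelta (a length of time, not a date)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_duration(delta):
    """Format a timedelta back to 'hh:mm:ss'; hours may exceed 23."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


total = parse_duration("01:23:45") + parse_duration("01:01:01")
print(format_duration(total))  # 02:24:46
```

Because the values are durations, the sum does not wrap at midnight: 23:00:00 plus 02:00:00 comes out as 25:00:00.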

Here is an example which does not work:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.013">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.013" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.013" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="45" y="120">
        <list key="attribute_values">
          <parameter key="time1" value="&quot;02:34:56&quot;"/>
          <parameter key="time2" value="&quot;01:01:01&quot;"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="nominal_to_date" compatibility="5.3.013" expanded="true" height="76" name="Nominal to Date" width="90" x="179" y="120">
        <parameter key="attribute_name" value="time1"/>
        <parameter key="date_type" value="time"/>
        <parameter key="date_format" value="hh:mm:ss"/>
      </operator>
      <operator activated="true" class="nominal_to_date" compatibility="5.3.013" expanded="true" height="76" name="Nominal to Date (2)" width="90" x="313" y="120">
        <parameter key="attribute_name" value="time2"/>
        <parameter key="date_type" value="time"/>
        <parameter key="date_format" value="hh:mm:ss"/>
      </operator>
      <operator activated="true" class="date_to_numerical" compatibility="5.3.013" expanded="true" height="76" name="Date to Numerical" width="90" x="447" y="120">
        <parameter key="attribute_name" value="time1"/>
        <parameter key="millisecond_relative_to" value="epoch"/>
        <parameter key="keep_old_attribute" value="true"/>
      </operator>
      <operator activated="true" class="date_to_numerical" compatibility="5.3.013" expanded="true" height="76" name="Date to Numerical (2)" width="90" x="581" y="120">
        <parameter key="attribute_name" value="time2"/>
        <parameter key="millisecond_relative_to" value="epoch"/>
        <parameter key="keep_old_attribute" value="true"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.3.013" expanded="true" height="76" name="Generate Attributes" width="90" x="45" y="255">
        <list key="function_descriptions">
          <parameter key="SumOfBoth" value="time1_millisecond+time2_millisecond"/>
        </list>
      </operator>
      <operator activated="true" class="numerical_to_date" compatibility="5.3.013" expanded="true" height="76" name="Numerical to Date" width="90" x="179" y="255">
        <parameter key="attribute_name" value="SumOfBoth"/>
      </operator>
      <connect from_op="Generate Data by User Specification" from_port="output" to_op="Nominal to Date" to_port="example set input"/>
      <connect from_op="Nominal to Date" from_port="example set output" to_op="Nominal to Date (2)" to_port="example set input"/>
      <connect from_op="Nominal to Date (2)" from_port="example set output" to_op="Date to Numerical" to_port="example set input"/>
      <connect from_op="Date to Numerical" from_port="example set output" to_op="Date to Numerical (2)" to_port="example set input"/>
      <connect from_op="Date to Numerical (2)" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Numerical to Date" to_port="example set input"/>
      <connect from_op="Numerical to Date" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • acea Member Posts: 2 Contributor I
    All right, it seems there is no better answer than the one I "found": preprocessing the CSV data manually in Excel.
    It would be nice, though, if future versions of RapidMiner supported simple arithmetic for time values.
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello

    Within the Generate Attributes operator, there are many date and time functions that can be used. You could use date_parse to convert a time in milliseconds to a date and time; there is an example at http://rapidminernotes.blogspot.co.uk/2011/02/converting-unix-timestamps-in.html

    For the process you provided, you also need to change the date_type parameter to date_time in the Nominal to Date operators. There seems to be an oddity (dare I say, a feature) in this operator when time is used as the type: it adds an extra month to the Unix timestamp.

    regards

    Andrew
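
The millisecond route described in the answer above can be sketched outside RapidMiner as follows. This is a Python stand-in, not the RapidMiner operators themselves: the helper names are hypothetical, and it assumes each time is anchored to 1970-01-01 in UTC, so that its epoch value equals the elapsed milliseconds since midnight.

```python
from datetime import datetime, timezone


def time_to_epoch_ms(text):
    """Parse 'HH:MM:SS' as a time on 1970-01-01 UTC and return epoch milliseconds."""
    parsed = datetime.strptime(text, "%H:%M:%S")  # strptime defaults to 1900-01-01
    # Assumption: anchor the time to the epoch date in UTC so no offset sneaks in.
    anchored = parsed.replace(year=1970, month=1, day=1, tzinfo=timezone.utc)
    return int(anchored.timestamp() * 1000)


def epoch_ms_to_time(ms):
    """Format epoch milliseconds as 'HH:MM:SS' in UTC (wraps after 24 hours)."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).strftime("%H:%M:%S")


# Same sample values as in the process XML above.
total_ms = time_to_epoch_ms("02:34:56") + time_to_epoch_ms("01:01:01")
print(epoch_ms_to_time(total_ms))  # 03:35:57
```

The sum is only meaningful when both values carry no offset from the epoch. If each parsed value includes an extra offset (a time zone shift, or the extra month mentioned above), that offset is counted twice in the sum and the formatted result is wrong.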