DTW clustering multiple time series

Zeebzog Member Posts: 1 Newbie
edited May 7 in Help
I want to cluster many (thousands of) time series using DTW or a similar method.
I see K-Medoids in RapidMiner, but I can’t figure out how to pass it multiple time series. I have a single attribute that measures the size of a particular event. No two series have the same length or cover exactly the same period (some last minutes, some cover years), but the type of event is the same and all of the series fall within a 2.5-year timeframe. I want to cluster the series by how similar they are to one another.
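
To make the goal concrete, here is a minimal sketch in plain Python (not RapidMiner) of what I have in mind: a textbook DTW distance, which accepts series of different lengths, feeding a simple k-medoids loop over a precomputed distance matrix. The function names `dtw_distance` and `k_medoids`, the absolute-difference local cost, and the farthest-first initialization are all assumptions made for illustration, not anything taken from RapidMiner's operators.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with |x - y| as the local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def assign(dist, medoids):
    """Label every item with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix (requires k <= n)."""
    n = len(dist)
    # Deterministic farthest-first start: begin with item 0, then keep
    # adding the item farthest from the medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[c])
                continue
            # New medoid = member with the smallest total distance to the rest
            new_medoids.append(min(members,
                                   key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Toy series of unequal length (made-up values, for illustration only)
series = [[1, 2, 3], [1, 2, 2, 3, 3], [10, 11, 12, 13, 14], [10, 12, 14]]
dist = [[dtw_distance(s, t) for t in series] for s in series]
medoids, labels = k_medoids(dist, k=2)
print(medoids, labels)
```

Note that the pairwise matrix needs on the order of N² DTW computations, each costing time proportional to the product of the two series lengths, so for thousands of long series this naive version would be slow.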

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,543 Unicorn
    What is your definition of "similar" here if the dates are not aligned? Recall that any clustering algorithm uses some kind of distance metric: it compares examples on the same attribute and then combines the distances across all attributes. So for your approach to work, I think you will have to come up with some kind of harmonized date series and then interpolate or aggregate as needed (all of which can be done within RapidMiner using the Time Series operators).
    Then you will probably also want to normalize, given that many distance metrics are sensitive to differences in absolute magnitude from one attribute to another.
     
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
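
The harmonize-then-normalize idea above can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions: `resample_linear` and `z_normalize` are hypothetical helper names, and the sketch rescales each series onto the same number of evenly spaced points across its own time span, which is just one possible definition of "harmonized" (it compares shape and ignores duration). Aggregating onto a shared calendar grid would be a different, equally valid choice.

```python
from statistics import mean, pstdev


def resample_linear(times, values, n_points):
    """Linearly interpolate a series onto n_points evenly spaced timestamps
    between its first and last observation. `times` must be sorted ascending."""
    if len(times) != len(values) or not times:
        raise ValueError("times and values must be non-empty and equally long")
    if len(times) == 1 or n_points == 1:
        return [float(values[0])] * n_points
    t0, t1 = times[0], times[-1]
    step = (t1 - t0) / (n_points - 1)
    out, j = [], 0
    for k in range(n_points):
        t = t0 + k * step
        # Advance to the segment [times[j], times[j + 1]] that contains t
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        span = times[j + 1] - times[j]
        w = (t - times[j]) / span if span else 0.0
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out


def z_normalize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)  # a constant series carries no shape
    return [(v - mu) / sd for v in values]


# Two made-up series with different lengths and time spans (seconds)
short = resample_linear([0, 30, 60], [5.0, 9.0, 7.0], n_points=8)
long_ = resample_linear([0, 86400, 172800, 259200], [50.0, 90.0, 80.0, 70.0], n_points=8)
rows = [z_normalize(short), z_normalize(long_)]
print(rows)
```

After this step every series is a row with the same number of attributes on a comparable scale, which is the shape that an ordinary distance-based clustering operator expects.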