trend analysis

Koosha Member Posts: 4 Contributor I
edited November 2018 in Help
Hi Everyone,

I need to search a dataset to find similar trends for a given pattern/trend.
A given pattern is available in a spreadsheet (for example a trend of heart rate values) and a separate spreadsheet represents a history of values for the same attribute (for example heart rate of a patient during 90 days). The goal is to find patterns similar to the given trend.

I am wondering about the process and operators that I can use.


I should mention that this is not a prediction problem; a simple process that uses Euclidean distance to match the input pattern against a window moving along the test set would do the job.
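
For reference, the sliding-window matching described above can be sketched outside RapidMiner in a few lines of Python. This is only a minimal sketch: the function name `sliding_window_distances` and the sample values are made up for illustration and are not part of RapidMiner or of the actual data.

```python
import math


def sliding_window_distances(series, pattern):
    """Return (start_index, distance) for every window of len(pattern) in series."""
    width = len(pattern)
    if width == 0 or width > len(series):
        return []
    results = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        # Euclidean distance between this window and the pattern
        distance = math.sqrt(sum((w - p) ** 2 for w, p in zip(window, pattern)))
        results.append((start, distance))
    return results


# Hypothetical heart-rate values, for illustration only
history = [70, 72, 75, 80, 78, 74, 71, 73, 76, 81]
pattern = [72, 75, 80]

# Sort so that the most similar windows come first
matches = sorted(sliding_window_distances(history, pattern), key=lambda m: m[1])
print(matches[0])  # the window starting at index 1 is an exact match
```

Sorting by distance gives a ranking of candidate windows; a distance threshold could be used instead if only close matches are of interest.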


Thanks,
-- Koosha

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    You could window your data using the MultivariateSeries2WindowExamples operator. If you are interested only in the trend and not in the absolute values, you could then apply the WindowExamples2ModelingData operator. After that, you could apply ExampleSet2SimilarityExampleSet to get a matrix with all pairwise distances.
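
The three steps above (windowing, removing the absolute level, computing pairwise distances) can be illustrated conceptually in plain Python. This is a hedged sketch, not a description of what the RapidMiner operators do internally: subtracting the first value of each window is an assumption chosen here to make windows comparable by shape, and all function names and sample values are hypothetical.

```python
import math


def make_windows(series, width):
    """Cut a series into overlapping windows of the given width."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def to_relative(window):
    """Express a window relative to its first value so only the shape remains.

    Assumption: this is one simple way to drop the absolute level; the
    RapidMiner operator may use a different base value.
    """
    base = window[0]
    return [value - base for value in window]


def pairwise_distances(rows):
    """Symmetric matrix of Euclidean distances between all rows."""
    return [
        [math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) for b in rows]
        for a in rows
    ]


# Hypothetical values: the same rising shape appears at two different levels
series = [100, 102, 105, 60, 62, 65]
windows = [to_relative(w) for w in make_windows(series, 3)]
matrix = pairwise_distances(windows)
print(matrix[0][3])  # first and last window have the same shape
```

Because each window is shifted to start at zero, the first and last windows compare as identical even though their absolute values differ by 40.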

    Greetings,
      Sebastian
  • Koosha Member Posts: 4 Contributor I
    Hi Sebastian,

    Thanks for your reply.

    I still have a problem, and I couldn't figure it out from the log message.
    A message box pops up: "Process failed! The setup does not seem to contain any obvious errors, but you should check the log..."

    -------------------
    log message:
    -------------------
    G Nov 12, 2009 10:48:33 PM: [Fatal] NullPointerException occured in 1st application of ExampleSet2SimilarityExampleSet (ExampleSet2SimilarityExampleSet)
    G Nov 12, 2009 10:48:33 PM: [Fatal] Process failed: operator cannot be executed. Check the log messages...
              Root[1] (Process)
              +- ExcelExampleSource[1] (ExcelExampleSource)
              +- MultivariateSeries2WindowExamples[1] (MultivariateSeries2WindowExamples)
              +- WindowExamples2ModelingData[1] (WindowExamples2ModelingData)
    here ==> +- ExampleSet2SimilarityExampleSet[1] (ExampleSet2SimilarityExampleSet)

    -----------------------------
    my experiment setup:
    -----------------------------
    <operator name="Root" class="Process" expanded="yes">
        <parameter key="logfile" value="/Users/kg/university/visualization/test_data_KG.log"/>
        <operator name="ExcelExampleSource" class="ExcelExampleSource" breakpoints="after">
            <parameter key="excel_file" value="/Users/kg/university/visualization/testData_KG.xls"/>
            <parameter key="first_row_as_names" value="true"/>
            <parameter key="label_column" value="2"/>
        </operator>
        <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
        </operator>
        <operator name="WindowExamples2ModelingData" class="WindowExamples2ModelingData">
            <parameter key="label_name_stem" value="HR"/>
        </operator>
        <operator name="ExampleSet2SimilarityExampleSet" class="ExampleSet2SimilarityExampleSet">
            <parameter key="measure_types" value="NumericalMeasures"/>
        </operator>
    </operator>


    -------------------
    input dataset:
    -------------------
    I have a spreadsheet in the following form:
    Day | HR  | BPD | BPS
    1   | 111 | 72  | 100
    2   | 67  | 108 | 130
    3   | 73  | 83  | 134
    4   | 120 | 73  | 150
    5   | 86  | 65  | 148
    ... | ... | ... | ...



    Any help is greatly appreciated.


    Thanks,
    -- Koosha
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    You have to add an IdTagging operator before the ExampleSet2SimilarityExampleSet operator. Then it will work!
    I will replace the unhelpful error message right away.

    Greetings,
      Sebastian
  • Koosha Member Posts: 4 Contributor I
    Hi Sebastian,

    I still have some problems finding trends similar to an input trend in a given data series. I have explained the problem at http://www.ualberta.ca/~golmoham/rapid-i/
    Also please let me know if there are other ways to do this.


    Thanks,
    -- Koosha
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    If I understand you correctly, you have exactly one example that you want to match against the reference curve, and you only want the distances between this one example and the windowed examples of the curve?
    This cannot be done easily with the core operators, but we could develop a plugin for you that does exactly this. As with anything that takes work, though, it would have to be paid for.
    The less convenient but cheaper way would be to use example iterators, Attribute Construction and ExampleFilter to get there. Even I would need some time to design this process, but it's doable. I know because someone did exactly this in a project.

    Greetings,
      Sebastian
  • Koosha Member Posts: 4 Contributor I
    Hi Sebastian,

    In fact, the idea is to have two files: one file holds the pattern we are looking for, and the other holds the data series (in my current setup I used a single file, though). I would appreciate it if you could tell me what a setup using two files would look like. In other words, the problem is: for a given trend (for example, five data points in a spreadsheet), find all similar patterns in another file containing a data series (for example, the daily heart rate of a patient over a period of 200 days).
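
To make the two-file setup concrete, here is a minimal Python sketch of the intended behaviour, outside RapidMiner. It assumes both spreadsheets have been exported to CSV with a header row; the file names mentioned in the comment, the function names, the pattern values and the threshold are all hypothetical. The history values reuse the five HR rows from the sample spreadsheet shown earlier in this thread.

```python
import csv
import io
import math


def read_column(handle, column):
    """Read one numeric column from a CSV source that has a header row."""
    return [float(row[column]) for row in csv.DictReader(handle)]


def find_similar(series, pattern, max_distance):
    """Return (start_index, distance) for windows within max_distance, closest first."""
    width = len(pattern)
    if width == 0:
        return []
    hits = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        # Euclidean distance between this window and the pattern
        distance = math.sqrt(sum((w - p) ** 2 for w, p in zip(window, pattern)))
        if distance <= max_distance:
            hits.append((start, distance))
    return sorted(hits, key=lambda hit: hit[1])


# In-memory stand-ins for the two files. With real files one would use
# open("pattern.csv", newline="") and open("history.csv", newline="") instead
# (hypothetical file names).
pattern_file = io.StringIO("HR\n67\n73\n120\n")
history_file = io.StringIO("Day,HR\n1,111\n2,67\n3,73\n4,120\n5,86\n")

pattern = read_column(pattern_file, "HR")
history = read_column(history_file, "HR")
print(find_similar(history, pattern, max_distance=10.0))
```

The start index is zero-based, so a hit at index 1 corresponds to the window beginning on the second row of the history file.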
    I am a PhD student and I would like to use RapidMiner for my thesis research project. Unfortunately, I cannot afford a custom plugin.


    Thanks,
    -- Koosha
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Koosha,
    I think this is possible even without the plugin. Unfortunately, it's quite complex, and I hope you understand that I cannot spend over an hour building a custom process. If you play around with the operators I mentioned in my last post, you will probably get a feeling for what you need to achieve your goal.

    Greetings,
      Sebastian