"Clustering files"

hunhun Member Posts: 1 Learner III
edited June 2019 in Help
Hello!

I have a question about clustering. I have some GPS data (the most important attributes are longitude and latitude). I would like to cluster the files as a whole, not the individual rows.

E.g., I have 10 CSV files (each contains lon and lat attributes, with 1000-2000 rows). 8 routes are similar (home -> work) and 2 routes are different (home -> shop and work -> restaurant). I would like to get cluster0: the 8 similar routes, and cluster1: the others. Or cluster0: the 8 similar routes, cluster1: one of the others, cluster2: the remaining one.

Any idea how I can do that?

Thanks in advance.

Adam

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,531 RM Data Scientist
    Hi Adam,

    I think you can simply calculate statistics per file using Loop Files and Aggregate. Stuff like
    #Entries
    #Most common entry
    ...
    And then use clustering on the resulting examples (one per file). At least that would be my first guess.

    ~Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
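
Outside RapidMiner, the idea of "one row of statistics per file, then cluster those rows" could be sketched in plain Python roughly as follows. This is only an illustration under assumptions, not the process Martin describes: the column names `lon` and `lat`, the helper names (`load_route`, `route_features`, `kmeans`, `straight_route`), the choice of start point, end point and mean position as per-file statistics, and the synthetic coordinates are all made up for the example. The small k-means is a stand-in for whichever clustering operator you prefer.

```python
import csv
import math
from statistics import mean


def load_route(path):
    """Read one CSV file into a list of (lon, lat) tuples.

    Assumes a header row with columns named 'lon' and 'lat'.
    """
    with open(path, newline="") as f:
        return [(float(r["lon"]), float(r["lat"])) for r in csv.DictReader(f)]


def route_features(points):
    """Collapse a whole route (one file) into one fixed-length vector."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return [
        points[0][0], points[0][1],    # start point
        points[-1][0], points[-1][1],  # end point
        mean(lons), mean(lats),        # centre of the route
    ]


def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the vector farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(v, centroids[j]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels


def straight_route(start, end, n=50, offset=0.0):
    """Synthetic stand-in for a GPS file: a straight line from start to end."""
    return [
        (
            start[0] + (end[0] - start[0]) * i / (n - 1) + offset,
            start[1] + (end[1] - start[1]) * i / (n - 1) + offset,
        )
        for i in range(n)
    ]


# Made-up coordinates, used only so the sketch runs without real files.
home, work, shop, restaurant = (19.04, 47.50), (19.08, 47.53), (19.00, 47.45), (19.12, 47.56)

routes = [straight_route(home, work, offset=0.0001 * i) for i in range(8)]
routes.append(straight_route(home, shop))
routes.append(straight_route(work, restaurant))

# One feature vector per file, then cluster the files, not the rows.
labels = kmeans([route_features(r) for r in routes], k=3)
print(labels)
```

With real data you would build `routes` from `load_route(path)` for each CSV file instead of `straight_route`. If you add statistics with other units (such as the number of rows), it is probably worth normalising the features first, otherwise the largest-valued attribute dominates the distance.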