
Clustering on polynominal & text values

tlinder Member Posts: 1 Learner I
edited October 2019 in Help
Hi all,

I am rather new to data analysis and have a challenge to tackle.
Maybe you know how to handle it.
I have a dataset with lots of polynominal and text values. I want to create clusters within the data, but as far as I understand, k-means, for example, needs some kind of distance measure to produce output.
Do you know how to cluster such text values?

Background: I am a student and have received an example dataset from a boat manufacturer with detected quality issues. The goal is to find links between issues that occurred BEFORE the current production phase and the issues in the current production phase (FOCUS, column G).
I want to check for these links primarily via the localization tags, and possibly also at the material or item number level.

Info on the dataset:
Column G and row A were added by me to make the dataset and my aim easier to understand.


I would be grateful if anyone has a hint on how to start.

Cheers,
Thomas

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,527 RM Data Scientist
    Hey,

    you do not want to treat this as a clustering problem, but as a prediction (supervised learning) problem.

    I would highly recommend pushing this into AutoModel first.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
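
For readers who want to see what the supervised framing could look like outside RapidMiner, here is a minimal Python sketch. It is not what AutoModel does internally. The column names (`earlier_location`, `material`, `focus_issue`) and the six rows are invented for illustration and do not come from the dataset in the question. The sketch shows two things: turning nominal values into 0/1 indicator columns so that a learner needing numeric input can use them, and a simple per-value rate of issues in the focus phase as a first look at possible links.

```python
from collections import Counter

# Hypothetical rows: nominal attributes recorded before the focus phase,
# plus a label saying whether an issue appeared in the focus phase.
records = [
    {"earlier_location": "hull",  "material": "gelcoat",  "focus_issue": "yes"},
    {"earlier_location": "hull",  "material": "laminate", "focus_issue": "yes"},
    {"earlier_location": "deck",  "material": "gelcoat",  "focus_issue": "no"},
    {"earlier_location": "deck",  "material": "laminate", "focus_issue": "no"},
    {"earlier_location": "hull",  "material": "gelcoat",  "focus_issue": "no"},
    {"earlier_location": "cabin", "material": "laminate", "focus_issue": "yes"},
]


def one_hot(rows, columns):
    """Encode nominal columns as 0/1 indicator features (one per value)."""
    categories = {c: sorted({r[c] for r in rows}) for c in columns}
    names = [f"{c}={v}" for c in columns for v in categories[c]]
    vectors = [
        [1 if r[c] == v else 0 for c in columns for v in categories[c]]
        for r in rows
    ]
    return names, vectors


def issue_rate_by_value(rows, column, label="focus_issue", positive="yes"):
    """Share of rows with a focus-phase issue, per value of a nominal column."""
    totals = Counter(r[column] for r in rows)
    hits = Counter(r[column] for r in rows if r[label] == positive)
    return {value: hits[value] / totals[value] for value in totals}


names, vectors = one_hot(records, ["earlier_location", "material"])
rates = issue_rate_by_value(records, "earlier_location")

print(names)
print(vectors[0])
print(rates)
```

The indicator vectors, together with the `focus_issue` label, are the kind of input a classifier can be trained on. The per-value rates are only a descriptive check and say nothing about causation.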