Clustering on polynominal & text values

tlinder Member Posts: 1 Newbie
edited October 2019 in Help
Hi everyone,

I am rather new to data analysis and have a challenge to tackle.
Maybe you know how to handle it.
I have a dataset with lots of polynominal and text values. I want to create clusters within the data, but as far as I understand, k-means, for example, needs some kind of distance measure to produce output.
Do you know how to cluster such text values?
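
One common way to get distances out of nominal values is to count the attributes on which two records disagree (simple matching), or to one-hot encode the values so that a numeric distance can be used. The sketch below illustrates both in plain Python. The records and attribute names are hypothetical and not taken from the dataset in this thread, and it is only an illustration of the idea, not a RapidMiner process.

```python
def matching_distance(a, b):
    """Number of attributes on which two nominal records disagree."""
    return sum(x != y for x, y in zip(a, b))


def one_hot(records):
    """One 0/1 column per (attribute index, value) pair seen in the data."""
    n_attrs = len(records[0])
    columns = [
        (i, value)
        for i in range(n_attrs)
        for value in sorted({r[i] for r in records})
    ]
    return [[1 if r[i] == value else 0 for i, value in columns] for r in records]


def squared_euclidean(u, v):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


# Hypothetical issue records: (localization tag, material)
records = [
    ("hull", "gelcoat"),
    ("hull", "resin"),
    ("deck", "teak"),
]
encoded = one_hot(records)

for i in range(len(records)):
    for j in range(i + 1, len(records)):
        print(
            records[i], records[j],
            matching_distance(records[i], records[j]),
            squared_euclidean(encoded[i], encoded[j]),
        )
```

On one-hot encoded data, the squared Euclidean distance is exactly twice the number of mismatching attributes, so a distance-based method sees the same ordering of neighbours either way.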

Context: I am a student and have received an example dataset from a boat manufacturer with detected quality issues. The goal is to find links between issues that occurred BEFORE the current production phase and the issues in the current production phase (FOCUS, column G).
I want to check for links primarily via the localization tags, and possibly also at the material or item number level.

Info on the dataset:
Column G and row A were inserted by me to make the dataset and my aim easier to understand.

I would be grateful if anyone has a hint on how to start.



    MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

    You do not want to treat this as a clustering problem, but as a prediction (supervised learning) problem.

    I would highly recommend pushing this into AutoModel first.
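
To make the supervised framing concrete: the issue in the current phase (the FOCUS column) becomes the label, and attributes of earlier issues become the inputs. The sketch below is a deliberately naive baseline in plain Python that predicts the most frequent current-phase issue for each earlier localization tag. The rows, tag names, and issue names are hypothetical, and this is not what AutoModel does internally; it only shows the shape of the problem.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (localization tag of an earlier issue, issue in the current phase)
rows = [
    ("hull", "crack"),
    ("hull", "crack"),
    ("hull", "blister"),
    ("deck", "loose fitting"),
]


def fit_majority_per_tag(rows):
    """For each earlier tag, remember the most frequent current-phase issue."""
    counts = defaultdict(Counter)
    for tag, focus_issue in rows:
        counts[tag][focus_issue] += 1
    return {tag: c.most_common(1)[0][0] for tag, c in counts.items()}


def predict(model, tag, default=None):
    """Look up the learned issue for a tag; unseen tags fall back to `default`."""
    return model.get(tag, default)


model = fit_majority_per_tag(rows)
print(predict(model, "hull"))
print(predict(model, "deck"))
print(predict(model, "keel"))
```

A real model would use several input columns (for example material or item number as well) and be evaluated on held-out data, but the label-versus-inputs split stays the same.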

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany