YonesYones Member Posts: 1 Newbie
In RapidMiner, which visualizations help us better understand the data?


  • sara20sara20 Member Posts: 110 Unicorn
    edited June 2020


    It depends on your data, but once you import it, RapidMiner shows the number of occurrences of each value for each column.

    I hope this helps.

  • MarcoBarradasMarcoBarradas Administrator, Employee, RapidMiner Certified Analyst, Member Posts: 272 Unicorn
    @Yones, the context of the problem is really important for answering your question.

    To understand the distribution of your data, histograms and bar charts are really useful, since they let you spot outliers and see how an attribute is distributed.

    Scatter plots let you understand relationships between two attributes (later you will validate, through a correlation matrix and PCA, depending on your models, whether those attributes should be included in the model).
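
As a rough illustration of the validation step mentioned above, the relationship seen in a scatter plot can be checked numerically with a Pearson correlation coefficient. This is a minimal sketch in plain Python, not RapidMiner's own implementation; the function name `pearson` and the `age` / `income` sample values are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations (proportional to the std devs).
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / (dev_x * dev_y)


# Hypothetical attribute values, only for illustration.
age = [23, 31, 38, 45, 52, 60]
income = [21000, 30000, 36000, 47000, 50000, 58000]
print(round(pearson(age, income), 3))
```

A value close to 1 or -1 suggests the two attributes carry much of the same information, which is one input to deciding whether both belong in the model.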

    Line charts are really useful for time series analysis and trends.

    If you are analyzing patterns in web clicks, a Sankey diagram would be useful.
    A boxplot is another useful graph, since it shows the quartiles and outliers of an attribute, optionally grouped by the values of other attributes.
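
To make the quartiles and outliers of a boxplot concrete, here is a small sketch that computes them with Python's standard `statistics` module. It assumes the common 1.5 × IQR whisker rule and the "inclusive" quartile method; plotting tools (RapidMiner included) may use a different quartile convention, so the exact numbers can differ. The function name `boxplot_summary` and the sample values are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Return quartiles, whisker fences and outliers using the 1.5 * IQR rule."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Anything beyond the fences would be drawn as an individual outlier point.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical attribute values, only for illustration.
values = [12, 14, 15, 15, 16, 17, 18, 19, 95]
summary = boxplot_summary(values)
print(summary["q1"], summary["median"], summary["q3"], summary["outliers"])
```

Grouping by another attribute, as in a grouped boxplot, would amount to calling the same function once per group.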

    In general, you should spend some time visualizing your data, because doing so may give you interesting insights and questions that can later be answered and explored during your ETL process.

    You can find more about this in chapter 3 of this book: