Boxplot in Visualizations menu is not showing outliers

dannyV Member Posts: 4 Learner I
edited October 2019 in Help
Hi all,

Apparently, when I check the results of a dataset, the boxplot option does not show the outliers (or the mean). The whiskers extend to the min and max values. (The first plot is a Python boxplot, the second is the RM boxplot.)


  

So this is actually not a boxplot :).
Currently I have to install the Tukey extension to get outlier detection for each attribute.
Is there a setting to show outliers?
What can I do about it?
I have installed RM Studio 9.3 and have an Educational License.

Thank you!

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist
    Hi @dannyV ,
    Good catch. According to Wikipedia, there are many definitions of this:

    In box-and-whisker plots, the bottom and top of the box are always the first and third quartiles, and the band inside the box is always the second quartile (the median). But the ends of the whiskers can represent several possible alternative values, among them:

    • the minimum and maximum of all of the data[1] (as in figure 2)
    • the lowest datum still within 1.5 IQR of the lower quartile, and the highest datum still within 1.5 IQR of the upper quartile (often called the Tukey boxplot)[2][3][4] (as in figure 3)
    • one standard deviation above and below the mean of the data
    • the 9th percentile and the 91st percentile
    • the 2nd percentile and the 98th percentile.

    Any data not included between the whiskers should be plotted as an outlier with a dot, small circle, or star, but occasionally this is not done.
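To make the difference between the first two definitions concrete, here is a minimal sketch in plain Python. The helper names `quantile` and `whiskers`, the sample data, and the linearly interpolated quartile convention are assumptions chosen for illustration; this is not RapidMiner's or any plotting library's actual implementation.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list (assumed convention)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def whiskers(values, rule="tukey", k=1.5):
    """Return (lower whisker, upper whisker, outliers) for the chosen rule."""
    data = sorted(values)
    if rule == "minmax":
        # Definition 1: whiskers span all of the data, so nothing is an outlier.
        low, high = data[0], data[-1]
    elif rule == "tukey":
        # Definition 2: whiskers end at the most extreme data points that are
        # still within k * IQR of the lower and upper quartiles.
        q1, q3 = quantile(data, 0.25), quantile(data, 0.75)
        iqr = q3 - q1
        low = min(v for v in data if v >= q1 - k * iqr)
        high = max(v for v in data if v <= q3 + k * iqr)
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Anything beyond the whiskers would be drawn as an individual point.
    outliers = [v for v in data if v < low or v > high]
    return low, high, outliers


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]  # made-up data with one extreme value
print(whiskers(sample, "minmax"))  # (1, 50, [])
print(whiskers(sample, "tukey"))   # (1, 9, [50])
```

With the min/max rule the value 50 simply stretches the upper whisker, while the Tukey rule stops the whisker at 9 and reports 50 as an outlier.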


    I think we just chose a different one than Python?

    BR,
    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • dannyV Member Posts: 4 Learner I
    Hi Martin,

    Thank you for the fast and clear response.
    It is a pity that RM doesn't give me the option to choose.
    Is there no alternative way to plot according to the 'second definition' (Tukey)?

    Kind regards,
    Danny