Show LOESS in the scatter plot


One important purpose of visualization is to estimate the effect of an attribute on the label. A basic visualization for that purpose is the scatter plot, but the effect is sometimes difficult to judge by eye. For that reason, other tools such as SPSS, SAS, and Orange provide the ability to add a trendline. Even Excel has this feature.
My proposal:
Add a checkbox to the scatter chart labelled something like "Show LOESS".
Reasoning:
- LOESS is a reasonable non-parametric method that can handle non-linearities.
- It is already implemented in the Apache Commons Math library (org.apache.commons.math3), which RapidMiner already uses.
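To make the idea concrete, here is a rough, dependency-free sketch of what a LOESS curve computes: for each point, a straight line is fitted to its nearest neighbours with tricube distance weights, and the line is evaluated at that point. This is not RapidMiner code and not the Commons Math implementation (which, as far as I know, lives in the `LoessInterpolator` class under `org.apache.commons.math3.analysis.interpolation`). The class name `LoessSketch`, the method `smooth` and its parameters are all made up for illustration, and the sketch skips robustness iterations and confidence intervals.

```java
import java.util.Arrays;

/** Hypothetical illustration of LOESS: local linear fits with tricube weights. */
final class LoessSketch {

    /**
     * Returns the smoothed y value at each x[i].
     * x is assumed to be sorted ascending; bandwidth is the fraction of points
     * used for each local fit (0 < bandwidth <= 1). weights may be null, in
     * which case every point counts equally.
     */
    static double[] smooth(double[] x, double[] y, double[] weights, double bandwidth) {
        int n = x.length;
        // Number of neighbours per local fit, clamped to [2, n].
        int k = Math.max(2, Math.min(n, (int) Math.ceil(bandwidth * n)));
        double[] fitted = new double[n];

        for (int i = 0; i < n; i++) {
            // The distance to the k-th nearest point defines the local window.
            double[] dist = new double[n];
            for (int j = 0; j < n; j++) {
                dist[j] = Math.abs(x[j] - x[i]);
            }
            double[] sorted = dist.clone();
            Arrays.sort(sorted);
            double h = sorted[k - 1];

            // Accumulate sums for a weighted least-squares line y = a + b * x.
            double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
            for (int j = 0; j < n; j++) {
                double u = h > 0 ? dist[j] / h : (dist[j] == 0 ? 0 : 1);
                if (u >= 1) {
                    continue; // outside the window: tricube weight is zero
                }
                double t = 1 - u * u * u;
                double w = t * t * t * (weights == null ? 1.0 : weights[j]);
                sw += w;
                swx += w * x[j];
                swy += w * y[j];
                swxx += w * x[j] * x[j];
                swxy += w * x[j] * y[j];
            }

            double denom = sw * swxx - swx * swx;
            if (sw == 0) {
                fitted[i] = y[i];          // no usable neighbours: keep the raw value
            } else if (Math.abs(denom) < 1e-12) {
                fitted[i] = swy / sw;      // degenerate window: fall back to weighted mean
            } else {
                double b = (sw * swxy - swx * swy) / denom;
                double a = (swy - b * swx) / sw;
                fitted[i] = a + b * x[i];
            }
        }
        return fitted;
    }
}
```

Calling `LoessSketch.smooth(x, y, null, 0.3)` on the sorted points of a scatter plot would give the y values of the curve to draw. A real implementation would more likely delegate to the existing library class than re-implement the fit.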
Implementation details:
- Show confidence intervals, not just the fitted curve, if possible.
- If "Color Column" is defined and discrete, show a separate fit for each colour, similar to what Mondrian does, including semi-transparent confidence intervals.
- Possibly respect sample weights, if specified.