Controlling how plots order nominal values
tennenrishin
Member Posts: 177 Contributor II
Hi.
Suppose a user wants to create a "bars stacked" plot, with ordered stacking and grouping categories.
How can this be accomplished reliably? If the outputs from the following process are plotted, either the x-axis is out of order, or the stacking categories are out of order, depending on how the data is sorted. This is the minimal exampleset that induces this problem.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.005">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
    <process expanded="true">
      <!-- Three one-row example sets: (red, b), (blue, b), (red, a) -->
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.005" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="45" y="300">
        <list key="attribute_values">
          <parameter key="StackBy" value="&quot;red&quot;"/>
          <parameter key="GroupBy" value="&quot;b&quot;"/>
          <parameter key="Value" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.005" expanded="true" height="60" name="Generate Data by User Specification (2)" width="90" x="45" y="210">
        <list key="attribute_values">
          <parameter key="StackBy" value="&quot;blue&quot;"/>
          <parameter key="GroupBy" value="&quot;b&quot;"/>
          <parameter key="Value" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.005" expanded="true" height="60" name="Generate Data by User Specification (3)" width="90" x="45" y="390">
        <list key="attribute_values">
          <parameter key="StackBy" value="&quot;red&quot;"/>
          <parameter key="GroupBy" value="&quot;a&quot;"/>
          <parameter key="Value" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="append" compatibility="5.3.005" expanded="true" height="112" name="Append" width="90" x="313" y="255"/>
      <!-- Sort by GroupBy first, then by StackBy; both orderings are delivered as results -->
      <operator activated="true" class="sort" compatibility="5.3.005" expanded="true" height="76" name="sort by GroupBy" width="90" x="514" y="255">
        <parameter key="attribute_name" value="GroupBy"/>
      </operator>
      <operator activated="true" class="sort" compatibility="5.3.005" expanded="true" height="76" name="sort by StackBy" width="90" x="648" y="255">
        <parameter key="attribute_name" value="StackBy"/>
      </operator>
      <connect from_op="Generate Data by User Specification" from_port="output" to_op="Append" to_port="example set 2"/>
      <connect from_op="Generate Data by User Specification (2)" from_port="output" to_op="Append" to_port="example set 1"/>
      <connect from_op="Generate Data by User Specification (3)" from_port="output" to_op="Append" to_port="example set 3"/>
      <connect from_op="Append" from_port="merged set" to_op="sort by GroupBy" to_port="example set input"/>
      <connect from_op="sort by GroupBy" from_port="example set output" to_op="sort by StackBy" to_port="example set input"/>
      <connect from_op="sort by StackBy" from_port="example set output" to_port="result 1"/>
      <connect from_op="sort by StackBy" from_port="original" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
How can it be solved, so that both the stacking order and the x-axis order are correct?
(It is important to have consistently ordered colors/categories when multiple plots need to be compared within/across reports, for example.)
Regards
Isak
Answers
With the plot view this is not easily possible, because the values are displayed in the order in which they first appear in the provided example set.
What you can do is use the Advanced Charts. There, the values are sorted alphabetically.
Best,
Nils
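To make the conflict concrete outside RapidMiner, here is a minimal Python sketch of the three examples from the question. It assumes the first-appearance rule described in the post above, and it uses Python's `sorted` as a stand-in for the Sort operator. The helper names `first_seen_order` and `axis_orders` are made up for this illustration and are not part of RapidMiner.

```python
def first_seen_order(values):
    """Return the distinct values in order of first appearance."""
    return list(dict.fromkeys(values))


def axis_orders(rows):
    """Category order a first-appearance plotter would use: (stack order, group order)."""
    stack = first_seen_order(r[0] for r in rows)
    group = first_seen_order(r[1] for r in rows)
    return stack, group


# Rows are (StackBy, GroupBy, Value), in the order the Append operator receives them.
rows = [("blue", "b", 1), ("red", "b", 1), ("red", "a", 1)]

by_group = sorted(rows, key=lambda r: r[1])      # like "sort by GroupBy"
by_stack = sorted(by_group, key=lambda r: r[0])  # like "sort by StackBy" applied afterwards

print(axis_orders(by_group))  # groups in order, stacking categories not
print(axis_orders(by_stack))  # stacking categories in order, groups not
```

With this data the only example in group "a" is red, and the only blue example is in group "b", so whichever attribute is sorted last, the other one starts with the wrong category.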
It needs to go in a report, and the reporting extension doesn't do Advanced Charts AFAIK. So I'm still stuck.
[quote author=Nils]with the plot view this is not easily possible...[/quote]
But possible?
Regards,
Isak
By "not easily possible" I meant that we would need to implement a new feature that allows specifying the ordering of the nominal values ;-)
But this means that currently it is not possible to do this with the normal plot view :-(
Best,
Nils
Like this:
d d d d d d d
d p
d p p p p
d p p
d p p
where rows are (ordered) stacking groups and columns are (ordered) x groups, and p are populated bins and d are dummy examples.
Do you think it could work? I'm not sure yet how to go about it generically.
Regards,
Isak
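One possible generic version of the dummy idea, sketched in Python rather than as a RapidMiner process: put one dummy per x group in the first stacking category, then one dummy per remaining stacking category in the first x group, and only then the real examples. The function name `pad_with_dummies` and the choice of 0 as the dummy value are assumptions made for this sketch. Whether the plot view and the reporting extension accept zero-valued examples this way has not been verified here.

```python
def first_seen_order(values):
    """Return the distinct values in order of first appearance."""
    return list(dict.fromkeys(values))


def pad_with_dummies(rows, stack_order, group_order):
    """Prepend zero-valued dummy rows so both attributes are first seen in the desired order.

    rows are (StackBy, GroupBy, Value) tuples.
    """
    # First row of the grid: the first stacking category in every x group.
    first_row = [(stack_order[0], g, 0) for g in group_order]
    # First column of the grid: every remaining stacking category in the first x group.
    first_col = [(s, group_order[0], 0) for s in stack_order[1:]]
    return first_row + first_col + list(rows)


# Real data in an order that would otherwise put "red" first.
rows = [("red", "a", 1), ("blue", "b", 1), ("red", "b", 1)]
padded = pad_with_dummies(rows, ["blue", "red"], ["a", "b"])

print(first_seen_order(r[0] for r in padded))  # stacking categories
print(first_seen_order(r[1] for r in padded))  # x groups
```

Because the dummies carry a value of 0, the stacked totals stay the same as long as the plot sums the value column. The cost is len(group_order) + len(stack_order) - 1 extra examples.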
Try using Nominal to Numerical.
Then use the numbers for coloring the plot, and supply a legend explaining how to interpret the colors.
A second way might be to split your dataset into subsets containing only one of the nominal values, generate the same new numerical attribute in each of them and set it accordingly.
Cheers
GzF
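As a rough illustration of the numeric-code idea in plain Python (not RapidMiner): each nominal value gets an integer according to the desired order, and the reverse mapping serves as the legend. The function name `encode_in_order` is hypothetical, and the code column simply plays the role of a new numerical attribute such as GroupBy2.

```python
def encode_in_order(rows, column, desired_order):
    """Append an integer code for rows[i][column] based on desired_order.

    Returns the extended rows and a legend mapping each code back to its nominal value.
    """
    codes = {value: i for i, value in enumerate(desired_order)}
    encoded = [r + (codes[r[column]],) for r in rows]
    legend = {i: value for value, i in codes.items()}
    return encoded, legend


# Rows are (StackBy, GroupBy, Value); column 1 is GroupBy.
rows = [("blue", "b", 1), ("red", "a", 1), ("red", "b", 1)]
encoded, legend = encode_in_order(rows, 1, ["a", "b"])

for row in encoded:
    print(row)
print(legend)
```

Numeric values have a natural order, so the axis no longer depends on the order in which the examples arrive. The price is that readers need the legend to translate the numbers back.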
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.015" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="45" y="120">
        <list key="attribute_values">
          <parameter key="StackBy" value="&quot;red&quot;"/>
          <parameter key="GroupBy" value="&quot;b&quot;"/>
          <parameter key="Value" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.015" expanded="true" height="60" name="Generate Data by User Specification (2)" width="90" x="45" y="30">
        <list key="attribute_values">
          <parameter key="StackBy" value="&quot;blue&quot;"/>
          <parameter key="GroupBy" value="&quot;b&quot;"/>
          <parameter key="Value" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.015" expanded="true" height="60" name="Generate Data by User Specification (3)" width="90" x="45" y="210">
        <list key="attribute_values">
          <parameter key="StackBy" value="&quot;red&quot;"/>
          <parameter key="GroupBy" value="&quot;a&quot;"/>
          <parameter key="Value" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="append" compatibility="5.3.015" expanded="true" height="112" name="Append" width="90" x="179" y="120"/>
      <operator activated="true" class="sort" compatibility="5.3.015" expanded="true" height="76" name="sort by GroupBy" width="90" x="313" y="120">
        <parameter key="attribute_name" value="GroupBy"/>
      </operator>
      <operator activated="true" class="sort" compatibility="5.3.015" expanded="true" height="76" name="sort by StackBy" width="90" x="380" y="255">
        <parameter key="attribute_name" value="StackBy"/>
      </operator>
      <!-- Split by GroupBy value, give each part a numeric GroupBy2, then recombine -->
      <operator activated="true" class="multiply" compatibility="5.3.015" expanded="true" height="94" name="Multiply" width="90" x="447" y="120"/>
      <operator activated="true" class="filter_examples" compatibility="5.3.015" expanded="true" height="76" name="Filter Examples" width="90" x="581" y="30">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="GroupBy = a"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.3.015" expanded="true" height="76" name="Generate Attributes (2)" width="90" x="715" y="30">
        <list key="function_descriptions">
          <parameter key="GroupBy2" value="0"/>
        </list>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.3.015" expanded="true" height="76" name="Filter Examples (2)" width="90" x="581" y="165">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="GroupBy = b"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.3.015" expanded="true" height="76" name="Generate Attributes (3)" width="90" x="715" y="165">
        <list key="function_descriptions">
          <parameter key="GroupBy2" value="1"/>
        </list>
      </operator>
      <operator activated="true" class="union" compatibility="5.3.015" expanded="true" height="76" name="Union" width="90" x="849" y="30"/>
      <connect from_op="Generate Data by User Specification" from_port="output" to_op="Append" to_port="example set 2"/>
      <connect from_op="Generate Data by User Specification (2)" from_port="output" to_op="Append" to_port="example set 1"/>
      <connect from_op="Generate Data by User Specification (3)" from_port="output" to_op="Append" to_port="example set 3"/>
      <connect from_op="Append" from_port="merged set" to_op="sort by GroupBy" to_port="example set input"/>
      <connect from_op="sort by GroupBy" from_port="example set output" to_op="sort by StackBy" to_port="example set input"/>
      <connect from_op="sort by StackBy" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Filter Examples (2)" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_op="Generate Attributes (2)" to_port="example set input"/>
      <connect from_op="Generate Attributes (2)" from_port="example set output" to_op="Union" to_port="example set 1"/>
      <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Generate Attributes (3)" to_port="example set input"/>
      <connect from_op="Generate Attributes (3)" from_port="example set output" to_op="Union" to_port="example set 2"/>
      <connect from_op="Union" from_port="union" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Cheers
GzF