"Plot view - bug?"

xkuba Member Posts: 3 Contributor I
edited May 23 in Help
Hi,

I'm trying to analyze input data: its attributes and the relations between them. I think I came across a small bug. Steps to reproduce:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
    <process expanded="true" height="482" width="712">
      <operator activated="true" class="generate_direct_mailing_data" compatibility="5.0.10" expanded="true" height="60" name="DirectMailingExampleSetGenerator (2)" width="90" x="45" y="30">
        <parameter key="number_examples" value="1000"/>
      </operator>
      <connect from_op="DirectMailingExampleSetGenerator (2)" from_port="output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

Then go to Plot View and select Plotter: Distribution, Class column: lifestyle, Plot column: car. And I can see this:

http://et5.comgate.cz/kuba/rm.png

It seems to me that the value selected in the "Plot Column" combo box is off by one relative to the values displayed in the plot. E.g. I chose Plot column: car, and I see a plot with sports on the X axis.

Or am I missing something?

One unrelated question: Is it possible to get the median when inspecting an ExampleSet? I can see mean, min, max, dev in the meta data view. Is there any plot showing this? Or is it necessary to use the Aggregate operator?

Thanks,

Kuba

P.S. Sorry for not using bug tracker - I've tried to register several hours ago but no email came...

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525 Unicorn
    Hi,
    yes, you are right. Thanks for the hint. Please let us know if you can't log in to the bug tracker; if you manage to register, please file this as a bug.

    Greetings,
      Sebastian
  • xkuba Member Posts: 3 Contributor I
    I still can't register for the bug tracker. I submitted my email address twice but haven't received an activation email. I've checked my spam folder as well.

    Could you answer my question regarding median, please?
    One unrelated question: Is it possible to get median when inspecting ExampleSet? I can see mean, min, max, dev in meta data view. Is there any plot showing this? Or is it necessary to use Aggregate operator?
    Thank you,

    Kuba
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525 Unicorn
    Hi,
    I think you have to use the Aggregate operator. Calculating the median would simply take too much time (n log n as far as I know), so you don't want to calculate it automatically on large data sets.

    Greetings,
      Sebastian
  • dan_agape Member Posts: 106 Guru
    By the way, Sebastian, there is most probably an obvious way to do it, but can you (or someone else) please indicate how one can plot attributes with, let's say, previously evaluated weights, such that the bars representing the weights are displayed in decreasing/increasing order? This would help in producing a nice plot showing the importance of the input attributes in predicting the class attribute, for instance.

    Many thanks,
    Dan
  • dragoljub Member Posts: 241 Maven
    Agreed! I have worked on processes where we removed and inserted attribute columns in weighted order to trick the plotter into giving us an ordered feature set. This was especially useful for the deviation plot. Having an ordered list of weights would be nice in RM. Right now I export to Excel and sort the weights there.

    -Gagi
  • dan_agape Member Posts: 106 Guru
    Gagi, save your time by not opening Excel any more ;) I just found out about some operators, so here's a way to nicely display decreasing bars from the most to the least predictive attributes.

    Cheers
    Dan

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
        <process expanded="true" height="415" width="748">
          <operator activated="true" class="generate_direct_mailing_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Direct Mailing Data" width="90" x="45" y="120">
            <parameter key="number_examples" value="1000"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.0.10" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="120">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="name"/>
            <parameter key="invert_selection" value="true"/>
          </operator>
          <operator activated="true" class="weka:W-ChiSquaredAttributeEval" compatibility="5.0.1" expanded="true" height="76" name="W-ChiSquaredAttributeEval" width="90" x="313" y="120"/>
          <operator activated="true" class="weights_to_data" compatibility="5.0.10" expanded="true" height="60" name="Weights to Data" width="90" x="447" y="120"/>
          <operator activated="true" class="sort" compatibility="5.0.10" expanded="true" height="76" name="Sort" width="90" x="581" y="120">
            <parameter key="attribute_name" value="Weight"/>
            <parameter key="sorting_direction" value="decreasing"/>
          </operator>
          <connect from_op="Generate Direct Mailing Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="W-ChiSquaredAttributeEval" to_port="example set"/>
          <connect from_op="W-ChiSquaredAttributeEval" from_port="weights" to_op="Weights to Data" to_port="attribute weights"/>
          <connect from_op="Weights to Data" from_port="example set" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525 Unicorn
    Hi,
    oops, it seems I overlooked a post. Well, in the meantime you have already found the answer :) And probably learned something about the benefits of the modular organization of RapidMiner's functions :)

    Greetings,
      Sebastian
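
On the median question raised earlier in the thread: the "n log n" cost mentioned there corresponds to computing the median by sorting the whole column first. A minimal sketch of that approach in plain Python follows; it is only an illustration, not how RapidMiner's Aggregate operator is implemented, and the function name `median_by_sorting` and the sample `ages` values are made up for this example.

```python
import statistics


def median_by_sorting(values):
    """Return the median by fully sorting the input (O(n log n) time)."""
    ordered = sorted(values)  # the sort dominates the running time
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the middle element is the median.
        return ordered[mid]
    # Even count: average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical numeric column, e.g. a handful of "age" values.
ages = [23, 45, 31, 62, 38, 54]
print(median_by_sorting(ages))

# Cross-check against the standard library implementation.
assert median_by_sorting(ages) == statistics.median(ages)
```

Selection algorithms can find the median without a full sort, but the sketch sticks to the sorting approach because that is the cost the thread refers to.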