RapidMiner

Learner I smmsamm

How can I have some melting function in rapidminer?

I am a beginner in data mining.

I have a list of 10,000 rows and about 200 columns, like this:

 

look,1,2,3,4,5,6,7,8

book,4,5,6,7,8,102,104,107

look,6,7,8,9

hook,100,101,102

cook,7,8,9

build,102,103,104,107

hook,103,104,105

...

 

First, I need to build a list with one row per unique word, merging the values of duplicate rows:

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

hook,100,101,102,103,104,105

cook,7,8,9

build,102,103,104,107

 

Next, I need to find rows that share at least 3 (or n) values and generate a new list of groups:

 

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

cook,7,8,9

*************

book,4,5,6,7,8,102,104,107

build,102,103,104,107

*************

hook,100,101,102,103,104,105

build,102,103,104,107

*************

 

Please help me in any way you can.

Thank you.

13 REPLIES
RM Certified Expert

Re: How can I have some melting function in rapidminer?

What is a melting function?
Learner I smmsamm

Re: How can I have some melting function in rapidminer?

I searched the internet and someone said Python's melt could help me, but I don't know how to do that in RapidMiner.

RM Staff

Re: How can I have some melting function in rapidminer?

Hi,

from the pandas doc for melt:

“Unpivots” a DataFrame from wide format to long format, optionally leaving identifier variables set.

I guess it maps to something along the lines of De-Pivot.  
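
For illustration, here is a rough plain-Python sketch of the wide-to-long reshaping that melt describes, and that De-Pivot presumably mirrors. The function name `melt_rows` and the sample rows are assumptions made up for this example (the rows are shortened from the data in the question); this is neither pandas nor RapidMiner code.

```python
def melt_rows(rows):
    """Turn wide rows (word, v1, v2, ...) into long (word, value) pairs."""
    long_rows = []
    for word, *values in rows:
        for value in values:
            # Skip empty cells, which appear when rows have different lengths.
            if value is None or value == "":
                continue
            long_rows.append((word, int(value)))
    return long_rows


# Hypothetical sample, shortened from the rows in the question.
wide = [
    ("look", 1, 2, 3),
    ("hook", 100, 101, 102),
    ("look", 6, 7, None),
]
long_format = melt_rows(wide)
```

Each wide row becomes one (word, value) pair per non-empty cell, which is the shape a cross table or grouping step can work with.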

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Certified Expert

Re: How can I have some melting function in rapidminer?

I guess I learned something new today!
Community Manager

Re: How can I have some melting function in rapidminer?

So that's a fun puzzle. I would begin like this (you will need @land's Statistics Extension to run this process):

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve smmsamm" width="90" x="45" y="85">
        <parameter key="repository_entry" value="smmsamm"/>
      </operator>
      <operator activated="true" class="de_pivot" compatibility="7.6.001" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="85">
        <list key="attribute_name">
          <parameter key="foo" value="att[2-9]"/>
        </list>
        <parameter key="index_attribute" value="bar"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="bar"/>
        <parameter key="invert_selection" value="true"/>
      </operator>
      <operator activated="true" class="numerical_to_polynominal" compatibility="7.6.001" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="447" y="85">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="foo"/>
      </operator>
      <operator activated="true" class="rmx_stat:cross_table" compatibility="1.3.000" expanded="true" height="82" name="Extract Cross Table" width="90" x="581" y="85">
        <parameter key="group_attribute_a" value="att1"/>
        <parameter key="group_attribute_b" value="foo"/>
      </operator>
      <connect from_op="Retrieve smmsamm" from_port="output" to_op="De-Pivot" to_port="example set input"/>
      <connect from_op="De-Pivot" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
      <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Extract Cross Table" to_port="example set input"/>
      <connect from_op="Extract Cross Table" from_port="cross table output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

That said, I am certain there is a cleverer way to do this!
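
To make the target concrete, the two steps from the question (merge duplicate words, then find words that share at least n values) can be sketched in plain Python. This is only a sketch under assumptions, not a translation of the process above: the function names `merge_words` and `similar_pairs` are invented for illustration, and the output is a list of pairs rather than the grouped blocks shown in the question.

```python
from collections import defaultdict
from itertools import combinations


def merge_words(rows):
    """Union the values of all rows that share the same word."""
    merged = defaultdict(set)
    for word, *values in rows:
        merged[word].update(values)
    return dict(merged)


def similar_pairs(merged, n=3):
    """Return (word_a, word_b, shared_values) for pairs sharing >= n values."""
    pairs = []
    for (a, vals_a), (b, vals_b) in combinations(merged.items(), 2):
        shared = vals_a & vals_b
        if len(shared) >= n:
            pairs.append((a, b, sorted(shared)))
    return pairs


# Sample rows copied from the question.
rows = [
    ("look", 1, 2, 3, 4, 5, 6, 7, 8),
    ("book", 4, 5, 6, 7, 8, 102, 104, 107),
    ("look", 6, 7, 8, 9),
    ("hook", 100, 101, 102),
    ("cook", 7, 8, 9),
    ("build", 102, 103, 104, 107),
    ("hook", 103, 104, 105),
]
merged = merge_words(rows)
pairs = similar_pairs(merged, n=3)
```

On the sample rows this yields the pairs look/book, look/cook, book/build, and hook/build, which matches the groups in the question. The pairwise comparison grows quadratically with the number of unique words, so it may need a smarter approach for very large lists.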


Scott

 

Scott Genzer
Senior Community Manager
RapidMiner, Inc.
Learner I smmsamm

Re: How can I have some melting function in rapidminer?

I updated my RapidMiner and installed the Statistics Extension:

!error0.jpg

but I get an error:

!error1.jpg

and I cannot find the missing extension:

!error2.jpg

Would you please help again?

Thank you.

Community Manager

Re: How can I have some melting function in rapidminer?

Hmm, I'm not sure the extension in the Marketplace is up to date (Sebastian?). I would go directly to the website: https://oldworldcomputing.com/products/statistics-extension-for-rapidminer

 

Scott

Scott Genzer
Senior Community Manager
RapidMiner, Inc.
Learner I smmsamm

Re: How can I have some melting function in rapidminer?

This is my CSV file.
Would you please test with it?

Community Manager

Re: How can I have some melting function in rapidminer?

So the process I posted was not intended to be a finished product, just something to point you in the right direction. :) If you load that CSV file into my process, you get the attached result.

 

Scott

Scott Genzer
Senior Community Manager
RapidMiner, Inc.