
# How can I get a melt-like function in RapidMiner?

Member Posts: 7 Contributor I
edited November 30 in Help

I am a beginner at data mining.

I have a list of about 10,000 rows and about 200 columns, like this:

look,1,2,3,4,5,6,7,8

book,4,5,6,7,8,102,104,107

look,6,7,8,9

hook,100,101,102

cook,7,8,9

build,102,103,104,107

hook,103,104,105

...

First, I need to build a list with one row per unique word, merging the values of duplicate words:

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

hook,100,101,102,103,104,105

cook,7,8,9

build,102,103,104,107
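The merging step described above can be sketched outside RapidMiner in a few lines of Python. This is only an illustration, assuming each input line is a word followed by comma-separated integers; the function name `merge_rows` is made up for the example.

```python
def merge_rows(lines):
    """Union the values of all rows that start with the same word."""
    merged = {}  # word -> set of values; dicts keep first-seen order
    for line in lines:
        fields = [f.strip() for f in line.split(",") if f.strip()]
        if not fields:
            continue  # skip empty lines
        word, values = fields[0], fields[1:]
        merged.setdefault(word, set()).update(int(v) for v in values)
    # Sort each value set so the output is stable and easy to read.
    return {word: sorted(vals) for word, vals in merged.items()}


rows = [
    "look,1,2,3,4,5,6,7,8",
    "book,4,5,6,7,8,102,104,107",
    "look,6,7,8,9",
    "hook,100,101,102",
    "cook,7,8,9",
    "build,102,103,104,107",
    "hook,103,104,105",
]
unique = merge_rows(rows)
for word, values in unique.items():
    print(word + "," + ",".join(map(str, values)))
```

With the seven sample rows this prints the five merged rows in first-seen order: look, book, hook, cook, build.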

Now I need to find rows that share at least 3 (or n) values and generate a new list:

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

cook,7,8,9

*************

book,4,5,6,7,8,102,104,107

build,102,103,104,107

*************

hook,100,101,102,103,104,105

build,102,103,104,107

*************
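One way to read the grouping above is: for each row, list the later rows that share at least n values with it, and skip rows that have no such partner. That reading is an assumption, but it reproduces the three groups shown. A minimal Python sketch (the name `overlap_groups` is hypothetical):

```python
def overlap_groups(table, n=3):
    """For each row, collect the later rows sharing at least n values with it."""
    words = list(table)
    sets = {word: set(values) for word, values in table.items()}
    groups = []
    for i, anchor in enumerate(words):
        partners = [
            other for other in words[i + 1:]
            if len(sets[anchor] & sets[other]) >= n
        ]
        if partners:  # rows without any partner produce no group
            groups.append([anchor] + partners)
    return groups


table = {
    "look": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "book": [4, 5, 6, 7, 8, 102, 104, 107],
    "hook": [100, 101, 102, 103, 104, 105],
    "cook": [7, 8, 9],
    "build": [102, 103, 104, 107],
}
groups = overlap_groups(table, n=3)
for group in groups:
    for word in group:
        print(word + "," + ",".join(map(str, table[word])))
    print("*************")
```

This compares every pair of rows, which is fine for a small example but grows quadratically with the number of unique words.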

thank you

• RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,762   Unicorn
What is a melting function?
• Member Posts: 7 Contributor I

I searched the internet and someone said Python's melt could help me, but I don't know how to do this in RapidMiner!

• Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 1,829  RM Data Scientist

Hi,

from the pandas doc for melt:

`“Unpivots” a DataFrame from wide format to long format, optionally leaving identifier variables set.`

I guess it maps to something along the lines of De-Pivot.
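To make "wide to long" concrete, here is a small plain-Python sketch of the unpivot idea. It is not pandas' or RapidMiner's implementation; the function name, the column names `att1` to `att4`, and the output keys `variable` and `value` are illustrative assumptions. Unlike pandas' `melt`, this sketch drops missing cells instead of keeping them.

```python
def melt(rows, id_key, value_keys, var_name="variable", value_name="value"):
    """Turn one wide row per id into one long row per (id, column) cell."""
    long_rows = []
    for row in rows:
        for key in value_keys:
            value = row.get(key)
            if value is None:
                continue  # assumption: skip missing cells
            long_rows.append(
                {id_key: row[id_key], var_name: key, value_name: value}
            )
    return long_rows


wide = [
    {"att1": "hook", "att2": 100, "att3": 101, "att4": 102},
    {"att1": "cook", "att2": 7, "att3": 8, "att4": None},
]
long_rows = melt(wide, "att1", ["att2", "att3", "att4"])
for row in long_rows:
    print(row)
```

Each wide row becomes several long rows, one per non-missing value, with the identifier repeated on every row.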

Best,

Martin

- Head of Data Science Services at RapidMiner -
Dortmund, Germany
• RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,762   Unicorn
I guess I learned something new today!
• Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager Posts: 1,886  Community Manager

So that's a fun puzzle. I would begin like this (you will need @land's Statistics Extension to run this process):

`<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">  <context>    <input/>    <output/>    <macros/>  </context>  <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">    <process expanded="true">      <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve smmsamm" width="90" x="45" y="85">        <parameter key="repository_entry" value="smmsamm"/>      </operator>      <operator activated="true" class="de_pivot" compatibility="7.6.001" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="85">        <list key="attribute_name">          <parameter key="foo" value="att[2-9]"/>        </list>        <parameter key="index_attribute" value="bar"/>      </operator>      <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">        <parameter key="attribute_filter_type" value="single"/>        <parameter key="attribute" value="bar"/>        <parameter key="invert_selection" value="true"/>      </operator>      <operator activated="true" class="numerical_to_polynominal" compatibility="7.6.001" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="447" y="85">        <parameter key="attribute_filter_type" value="single"/>        <parameter key="attribute" value="foo"/>      </operator>      <operator activated="true" class="rmx_stat:cross_table" compatibility="1.3.000" expanded="true" height="82" name="Extract Cross Table" width="90" x="581" y="85">        <parameter key="group_attribute_a" value="att1"/>        <parameter key="group_attribute_b" value="foo"/>      </operator>      <connect from_op="Retrieve smmsamm" from_port="output" to_op="De-Pivot" to_port="example set input"/>      <connect from_op="De-Pivot" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>      <connect 
from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>      <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Extract Cross Table" to_port="example set input"/>      <connect from_op="Extract Cross Table" from_port="cross table output" to_port="result 1"/>      <portSpacing port="source_input 1" spacing="0"/>      <portSpacing port="sink_result 1" spacing="0"/>      <portSpacing port="sink_result 2" spacing="0"/>    </process>  </operator></process>`

That said, I am certain there is a cleverer way to do this!
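For readers without RapidMiner at hand, the last step of the process can be approximated in plain Python. After De-Pivot the data is a long list of (`att1`, `foo`) pairs, and a cross table counts how often each word occurs with each value. This is a rough sketch only: the exact output of Extract Cross Table is not reproduced here, and the sample pairs and the function name are made up for illustration.

```python
from collections import Counter


def cross_table(pairs):
    """Count occurrences of each (word, value) pair and lay them out as a grid."""
    counts = Counter(pairs)
    row_labels = sorted({word for word, _ in counts})
    col_labels = sorted({value for _, value in counts})
    table = {
        word: {value: counts.get((word, value), 0) for value in col_labels}
        for word in row_labels
    }
    return table, col_labels


# Long-format pairs as they might look after the De-Pivot step (illustrative).
pairs = [
    ("hook", 100), ("hook", 101), ("hook", 102),
    ("build", 102), ("build", 103),
]
table, cols = cross_table(pairs)
print("word", *cols)
for word, row in table.items():
    print(word, *(row[c] for c in cols))
```

Note that the grid gets one column per distinct value, so it becomes very wide when there are many distinct values.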

Scott

• Member Posts: 7 Contributor I

I updated RapidMiner and installed the Statistics Extension, but I get an error, and I cannot find the missing extension.

Thank you

• Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager Posts: 1,886  Community Manager

Hmm, I'm not sure the extension in the Marketplace is up to date (Sebastian?). I would go directly to the website: https://oldworldcomputing.com/products/statistics-extension-for-rapidminer

Scott

• Member Posts: 7 Contributor I

This is my CSV file.
Would you please test with it?

• Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager Posts: 1,886  Community Manager

So the process I posted was not intended to be a finished product, just something to point you in the right direction. If you take that CSV file and put it in my process, you get the attached result.

Scott

• Member Posts: 7 Contributor I

Oh, thank you sir, you are the master!
But these were only sample data for testing.
My real data has about 100,000 different values. With this method, will I end up with about 100,000 columns?
Is it possible to convert the list into the list I want?

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

cook,7,8,9

*************

book,4,5,6,7,8,102,104,107

build,102,103,104,107

*************

hook,100,101,102,103,104,105

build,102,103,104,107

*************
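On the worry about 100,000 columns: counting shared values does not strictly need a wide table. A hedged sketch in Python, assuming the merged rows are available as a word-to-values mapping, is to build an inverted index (value to words) and count pairs per value. The name `shared_counts` is hypothetical, and the cost depends on how many words share each value.

```python
from collections import defaultdict
from itertools import combinations


def shared_counts(table):
    """Count shared values for each pair of words using an inverted index."""
    index = defaultdict(list)  # value -> words that contain it
    for word, values in table.items():
        for value in set(values):
            index[value].append(word)
    counts = defaultdict(int)  # (earlier word, later word) -> shared values
    for words in index.values():
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


table = {
    "look": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "book": [4, 5, 6, 7, 8, 102, 104, 107],
    "hook": [100, 101, 102, 103, 104, 105],
    "cook": [7, 8, 9],
    "build": [102, 103, 104, 107],
}
counts = shared_counts(table)
matches = {pair: c for pair, c in counts.items() if c >= 3}
for (a, b), c in matches.items():
    print(a, b, c)
```

Only pairs that actually share a value are ever touched, and no column per distinct value is created.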

• Member Posts: 7 Contributor I

I mean, can these columns be converted to rows along with their header values?

• Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager Posts: 1,886  Community Manager

Your flattery is noted and not deserved.  There are many here who are far more masterful than I.  That said, at this point I would recommend becoming more familiar with RapidMiner Studio before moving forward with large data sets like the one you describe; actions such as renaming attributes and so forth are the beginning of a long journey.  I would highly recommend starting with the "Getting Started with RapidMiner" YouTube playlist.  The whole beauty of RapidMiner is that you can learn to create your own processes and be a master yourself!

Scott

• RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,508   Unicorn

Hi all,

I just published the most recent version of our extensions on the marketplace. So if that was the problem, it should be gone now. At least I can use it with the most recent version of RM.

Greetings,

Sebastian