RapidMiner

Learner I

How can I have some melting function in rapidminer?

I am a beginner in data mining.

I have a list of 10,000 rows and about 200 columns, like this:

look,1,2,3,4,5,6,7,8

book,4,5,6,7,8,102,104,107

look,6,7,8,9

hook,100,101,102

cook,7,8,9

build,102,103,104,107

hook,103,104,105

...

First, I need to merge the rows that share the same word, so that each word appears only once:

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

hook,100,101,102,103,104,105

cook,7,8,9

build,102,103,104,107
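
Outside RapidMiner, the merge step above can be sketched in plain Python. This is only an illustration under assumptions: the function name `merge_rows` is hypothetical, the values are assumed to be integers, and the rows are taken from the sample in the question.

```python
def merge_rows(lines):
    """Merge rows that start with the same word into one sorted, de-duplicated row."""
    merged = {}
    for line in lines:
        # Drop empty cells, which a ragged CSV export may contain (assumption).
        parts = [p.strip() for p in line.split(",") if p.strip()]
        if not parts:
            continue
        word, values = parts[0], parts[1:]
        merged.setdefault(word, set()).update(int(v) for v in values)
    # dict keeps insertion order, so words appear in first-seen order.
    return {word: sorted(vals) for word, vals in merged.items()}


rows = [
    "look,1,2,3,4,5,6,7,8",
    "book,4,5,6,7,8,102,104,107",
    "look,6,7,8,9",
    "hook,100,101,102",
    "cook,7,8,9",
    "build,102,103,104,107",
    "hook,103,104,105",
]
unique = merge_rows(rows)
for word, vals in unique.items():
    print(word + "," + ",".join(map(str, vals)))
```

For the sample rows this prints the same five lines as the unique list above.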

Next, I need to find the lines that share at least 3 (or n) values and group them into a new list:

look,1,2,3,4,5,6,7,8,9

book,4,5,6,7,8,102,104,107

cook,7,8,9

*************

book,4,5,6,7,8,102,104,107

build,102,103,104,107

*************

hook,100,101,102,103,104,105

build,102,103,104,107

*************
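
One possible reading of the grouping above is "every pair of words that shares at least n values". A minimal Python sketch of that reading follows; the name `shared_pairs` is hypothetical, and the input is the merged list from the question. Note that under this reading `book` and `cook` share only 7 and 8, so they land in the first group through `look` rather than through each other.

```python
from itertools import combinations


def shared_pairs(table, n=3):
    """Return (word_a, word_b, shared_values) for every pair sharing at least n values."""
    result = []
    for (a, va), (b, vb) in combinations(table.items(), 2):
        common = sorted(set(va) & set(vb))
        if len(common) >= n:
            result.append((a, b, common))
    return result


table = {
    "look": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "book": [4, 5, 6, 7, 8, 102, 104, 107],
    "hook": [100, 101, 102, 103, 104, 105],
    "cook": [7, 8, 9],
    "build": [102, 103, 104, 107],
}
pairs = shared_pairs(table, n=3)
for a, b, common in pairs:
    print(a, b, common)
```

With 10,000 rows the pairwise comparison grows quadratically, so this sketch is meant to show the logic rather than to scale.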

Thank you.

13 REPLIES
RM Certified Expert

Re: How can I have some melting function in rapidminer?

What is a melting function?
Learner I

Re: How can I have some melting function in rapidminer?

I searched the internet, and someone said Python's melt can help me, but I don't know how to do this in RapidMiner!

RM Staff

Re: How can I have some melting function in rapidminer?

Hi,

from the pandas doc for melt:

`“Unpivots” a DataFrame from wide format to long format, optionally leaving identifier variables set.`

I guess it maps to something along the lines of De-Pivot.

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
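
To make the wide-to-long idea from the pandas quote concrete, here is a dependency-free Python sketch of the reshape. It is not the pandas implementation: the function name `melt` and the choice to skip empty cells are assumptions made for this illustration.

```python
def melt(rows, id_index=0):
    """Unpivot wide rows into (id, column_position, value) records, skipping empty cells."""
    long_rows = []
    for row in rows:
        ident = row[id_index]
        for pos, cell in enumerate(row):
            # Skip the identifier column itself and any empty cell.
            if pos == id_index or cell in ("", None):
                continue
            long_rows.append((ident, pos, cell))
    return long_rows


wide = [
    ["look", 1, 2, 3],
    ["cook", 7, 8, None],  # ragged row: last cell is empty
]
long_format = melt(wide)
for record in long_format:
    print(record)
```

Each wide row becomes one record per non-empty value, which is the shape the later cross-table step works on.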
RM Certified Expert

Re: How can I have some melting function in rapidminer?

I guess I learned something new today!
Community Manager

Re: How can I have some melting function in rapidminer?

That's a fun puzzle. I would begin like this (you will need @land's Statistics Extension to run this process):

```xml
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve smmsamm" width="90" x="45" y="85">
<parameter key="repository_entry" value="smmsamm"/>
</operator>
<operator activated="true" class="de_pivot" compatibility="7.6.001" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="85">
<list key="attribute_name">
<parameter key="foo" value="att[2-9]"/>
</list>
<parameter key="index_attribute" value="bar"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="bar"/>
<parameter key="invert_selection" value="true"/>
</operator>
<operator activated="true" class="numerical_to_polynominal" compatibility="7.6.001" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="447" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="foo"/>
</operator>
<operator activated="true" class="rmx_stat:cross_table" compatibility="1.3.000" expanded="true" height="82" name="Extract Cross Table" width="90" x="581" y="85">
<parameter key="group_attribute_a" value="att1"/>
<parameter key="group_attribute_b" value="foo"/>
</operator>
<connect from_op="Retrieve smmsamm" from_port="output" to_op="De-Pivot" to_port="example set input"/>
<connect from_op="De-Pivot" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
<connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Extract Cross Table" to_port="example set input"/>
<connect from_op="Extract Cross Table" from_port="cross table output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
```

That said, I am certain there is a cleverer way to do this!

Scott

Scott Genzer
Senior Community Manager
RapidMiner, Inc.
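
For readers who want to see what the De-Pivot plus cross-table idea amounts to, here is a rough Python sketch. It assumes a cross table simply counts how often each (word, value) pair occurs; the exact output of the Extract Cross Table operator has not been checked here, and the names `cross_table` and `overlap` are hypothetical.

```python
from collections import Counter


def cross_table(long_rows):
    """Count how often each (word, value) pair occurs in long-format rows."""
    counts = Counter(long_rows)
    words = sorted({w for w, _ in long_rows})
    values = sorted({v for _, v in long_rows})
    table = [[counts[(w, v)] for v in values] for w in words]
    return words, values, table


def overlap(row_a, row_b):
    """Number of value columns in which both words have a non-zero count."""
    return sum(1 for a, b in zip(row_a, row_b) if a and b)


# Long-format (word, value) rows for three words from the question.
long_rows = (
    [("hook", v) for v in (100, 101, 102, 103, 104, 105)]
    + [("build", v) for v in (102, 103, 104, 107)]
    + [("cook", v) for v in (7, 8, 9)]
)
words, values, table = cross_table(long_rows)
print(words)
print(overlap(table[words.index("build")], table[words.index("hook")]))
```

Once the data is in this word-by-value layout, counting shared values between two words reduces to comparing two rows of the table.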
Learner I

Re: How can I have some melting function in rapidminer?

I updated RapidMiner and installed the Statistics Extension, but I get an error, and I cannot find the missing extension.

Thank you

Community Manager

Re: How can I have some melting function in rapidminer?

Hmm, I'm not sure the extension in the Marketplace is up to date (Sebastian?). I would go directly to the website: https://oldworldcomputing.com/products/statistics-extension-for-rapidminer

Scott

Learner I

Re: How can I have some melting function in rapidminer?

This is my CSV file.
Would you please test the process with it?

Community Manager

Re: How can I have some melting function in rapidminer?

The process I posted was not intended to be a finished product, just something to point you in the right direction. If you load that CSV file into my process, you get the attached result.

Scott
