Matching Lists
Hi,
I was wondering if anybody could point me in the right direction on how to do this in RapidMiner. I'm new to the program and feel hopelessly lost.
I have 18 lists each containing different items and one master list containing all the items that could exist on any one of the 18 lists.
Ex:
List 1    List 2    Master List
A         B         A
C         A         B
D         E         C
E         D         D
          F         F
I need to use RapidMiner to compare each individual list to the master list, see what matches, and then output which items match from which list.
Ex.
Item    List
E       List 1, List 2
C       List 1
F       List 2
If anybody could point me in the right direction for how to start this process, I would be forever in your debt.
Thanks.
Sincerely,
Ram
Answers
This is not a common task, so there is no single operator that does all the magic for you at once. I am afraid you will have to write some code for it (by the way, if you know another programming language, you can of course do the matching before loading the data into RapidMiner).
Here is an outline of what you can do:
- Create a new set (the one that will contain the output at the end) using e.g. "Generate Nominal Data".
- Load the original data.
- Use the operator "Execute Script" with both sets as input to execute a Groovy script which:
1. Iterates over the original set and creates a set of items for every list and for the master list.
2. Iterates over the master list, checks for every item whether it is in each of the list sets, and fills the output set accordingly.
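For anyone who prefers to do the matching before loading the data into RapidMiner, the same two steps can be sketched in plain Python. This is only a minimal sketch, not RapidMiner or Groovy code: the function name `match_lists` is made up for illustration, and the sample values are adapted from the example in the question (the master list here is assumed to contain every item).

```python
def match_lists(lists, master):
    """Map each master item to the names of the lists that contain it."""
    # Step 1: build one set per list so membership checks are cheap.
    list_sets = {name: set(items) for name, items in lists.items()}

    # Step 2: walk the master list and record which lists contain each item.
    result = {}
    for item in master:
        hits = [name for name, items in list_sets.items() if item in items]
        if hits:  # skip master items that appear in none of the lists
            result[item] = hits
    return result


# Illustrative data only (assumed, adapted from the question's example).
lists = {
    "List 1": ["A", "C", "D", "E"],
    "List 2": ["B", "A", "E", "D", "F"],
}
master = ["A", "B", "C", "D", "E", "F"]

matches = match_lists(lists, master)
for item, names in matches.items():
    # One output row per item: the item, then the lists it was found in.
    print(item, ", ".join(names))
```

With 18 lists, only the `lists` dictionary grows; the two steps stay the same. The resulting item/list pairs could then be written to a CSV file and read into RapidMiner as an ordinary table.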
You may find this blog post about the Groovy functionality helpful:
http://rapid-i.com/component/option,com_myblog/task,tag/category,Script/Itemid,172/lang,en/
Hope this was helpful,
steffen
Thanks for the help and the welcome. I think these suggestions will be useful.
Ram