Append rows with different number of attributes
yerisderanak
Member Posts: 2 Learner II
Hi guys!
First, I'm a beginner with RapidMiner, so please be patient.
I have sets of data that describe movement. I was able to create a single row from every sample (with windowing), but every movement had a different length, so the number of rows varies greatly (a difference of 1k-2k). Now I would like to append all of that data to create one nice training set, but I can't because of the difference. I know that I can create empty attribute columns, but doing that by hand sounds impossible. Can I do it in some "smart way"? I don't want to retransform my data to a uniform size, as the lack of an attribute is valuable information about the movement's length and dynamics.
Answers
Hi,
I have always appended data sets that have the same number of columns/attributes. Have you tried using the Generate Attributes operator to create additional attributes, so that each data set has the same number of attributes? To denote a lack of attributes, you could use binary values: 1 if an attribute is present and 0 if it doesn't exist. I hope this helps.
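To make the binary-flag idea concrete, here is a minimal sketch in plain Python rather than RapidMiner. It only illustrates the logic; the attribute names (`w0`, `w1`, `w2`) and the helper `add_presence_flags` are hypothetical and not part of any RapidMiner API.

```python
def add_presence_flags(row, all_attributes):
    """Pad one row to the full attribute list and add a 0/1 flag per attribute.

    row            -- dict mapping attribute name to value (hypothetical layout)
    all_attributes -- every attribute name that should exist in the final set
    """
    flagged = {}
    for name in all_attributes:
        value = row.get(name)  # None when this sample is too short to have it
        flagged[name] = value
        flagged[name + "_present"] = 0 if value is None else 1
    return flagged


# A short movement with two windowed values, padded to three attributes.
short_sample = {"w0": 0.1, "w1": 0.2}
padded = add_presence_flags(short_sample, ["w0", "w1", "w2"])
```

Here `padded["w2"]` is left empty and `padded["w2_present"]` is 0, so the information that the movement was short is kept as an explicit attribute.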
Have you tried 'append' or 'union'?
If all the samples have the same attributes, the 'append' operator can be used to build a merged ExampleSet from two or more compatible ExampleSets by adding all examples into a combined set.
http://docs.rapidminer.com/studio/operators/blending/table/joins/append.html
The 'union' operator builds the superset of features of both input ExampleSets, such that all regular attributes of both ExampleSets are part of the superset. If a column is not available in one sample, it is filled with missing values in the merged ExampleSet.
http://docs.rapidminer.com/studio/operators/blending/table/joins/union.html
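As a rough illustration of the union behaviour described above (superset of attributes, missing values where a sample lacks a column), here is a small plain-Python sketch. It is not RapidMiner code; the function `union_rows` and the attribute names are made up for the example, and `None` stands in for a missing value.

```python
def union_rows(*example_sets):
    """Merge several lists of dict rows into one list over the superset of attributes."""
    # Collect attribute names in first-seen order.
    columns = []
    for rows in example_sets:
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    # Rebuild every row over the full attribute list; absent attributes become None.
    merged = [
        {name: row.get(name) for name in columns}
        for rows in example_sets
        for row in rows
    ]
    return columns, merged


short_set = [{"w0": 0.1, "w1": 0.2}]            # short movement
long_set = [{"w0": 0.3, "w1": 0.4, "w2": 0.5}]  # longer movement

columns, merged = union_rows(short_set, long_set)
```

After the call, `merged` holds two rows with the same three attributes, and the row from the short movement has a missing value for `w2`.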