# Sequential Supervised Learning

Hi all,

I need help on how to model this particular problem in RapidMiner. Here's sample data I'm trying to model in RapidMiner:

| id | sequence   | rank |
|----|------------|------|
| 1  | 1020110201 | 40   |
| 2  | 0010120100 | 34   |
| 3  | 2110100110 | 18   |
| 4  | 0120010110 | -13  |
| 5  | 0101010020 | -98  |
| 6  | 0101210010 | -21  |

As you can see, each sequence consists of 10 digits drawn from '0', '1', and '2', and each sequence is associated with a single rank value (positive or negative), e.g. 1020110201 -> 40. The intention is to learn from all past sequences and their corresponding ranks, so that, given a new sequence such as 0101211010, the classifier can predict its rank.

What is the best way to model this in RapidMiner? Right now (following the neural trend tutorial) I have assigned rank as the label and used 10 separate attributes to capture a single sequence, but I'm not sure this is the correct way, since in my case the sequence string exhibits significant sequential correlation.
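For readers following along, the layout described above (one attribute per position, rank as the label) can be sketched outside RapidMiner as follows. This is only an illustration of the data shape, not a RapidMiner process; the attribute names `pos_1` ... `pos_10` and the helper `to_example` are made up for this sketch.

```python
# Sample rows from the post: (id, sequence, rank)
ROWS = [
    (1, "1020110201", 40),
    (2, "0010120100", 34),
    (3, "2110100110", 18),
    (4, "0120010110", -13),
    (5, "0101010020", -98),
    (6, "0101210010", -21),
]


def to_example(sequence, rank):
    """Split a sequence into one nominal attribute per position.

    The names pos_1 ... pos_10 are hypothetical; any naming scheme works.
    """
    example = {f"pos_{i + 1}": symbol for i, symbol in enumerate(sequence)}
    example["rank"] = rank  # the label attribute
    return example


examples = [to_example(seq, rank) for _, seq, rank in ROWS]

# Each example has 10 positional attributes plus the label.
for example in examples:
    print(example)
```

Note that this flat layout treats each position independently; it does not by itself encode any correlation between neighbouring positions.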

Your help is very much appreciated.

regards,

nurman

## Answers

As your premise sequence can have a very large number of permutations, each representing a signed integer label, it would help if you knew whether you could at least get the sign right. So just keep it simple: 10 nominal attributes, binary label plus/minus. Also, there is a whole heap of supporting videos and tutorials to tell you how to optimise.
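As a rough sketch of that suggestion (plain Python rather than a RapidMiner process), the rank can be reduced to a plus/minus label before any learner is applied. The function name `sign_label` and the choice to treat a rank of 0 as "plus" are assumptions made for this example; the sample data contains no zero rank.

```python
# Size of the input space: 3 symbols at each of 10 positions.
ALPHABET = "012"
SEQUENCE_LENGTH = 10
possible_sequences = len(ALPHABET) ** SEQUENCE_LENGTH  # 3**10 = 59049

# Ranks taken from the sample data in the question.
RANKS = {
    "1020110201": 40,
    "0010120100": 34,
    "2110100110": 18,
    "0120010110": -13,
    "0101010020": -98,
    "0101210010": -21,
}


def sign_label(rank):
    """Map a signed rank to a binary nominal label.

    Assumption: a rank of exactly 0 is labelled "plus".
    """
    return "plus" if rank >= 0 else "minus"


binary_labels = {seq: sign_label(rank) for seq, rank in RANKS.items()}

print(possible_sequences)
for seq, label in binary_labels.items():
    print(seq, label)
```

With six examples against 59,049 possible sequences, predicting the sign first is a far easier target than predicting the exact rank.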

In my own work I am constantly surprised at my own bias, and always go through a brutal self beat-up when reality checks in. I dread to think how many years have been spent on roads that lead nowhere, but that's another story...

So, a concrete proposal would be most welcome, Hero. We are just noobs...

Regards,

ST

Looking at your posts I think you'll find the work of Kadous and Sammut on sign language recognition rather interesting ;D RapidMiner is a sort of propositional Lego; once you know what you want you clip the bits together, but 'knowing what you want' is easier said than done! Here's the link to some really good work...

http://www.springerlink.com/content/wp2506r752qv1623/