# Preparing data for pattern recognition

Hello again, I'm still new to RapidMiner, so please be patient.

I am currently writing my master's thesis in electrical engineering. I think RapidMiner would be a good fit for producing simulation results, and I could write about the simulation in my thesis. But first, some information about my data: I have a database, i.e. a set of training data, that looks like this:

There are multiple containers (1..n). Each container has multiple measurements (1..m). Each measurement consists of x-y-z data. The number of values varies between measurements: the first measurement might have 100 values for each of the components x, y and z, the second might have 130 values per component, and so on.

container-1 [ measurement-1[x,y,z], measurement-2[x,y,z], measurement-3[x,y,z], ..., measurement-m[x,y,z]]

container-2 [ measurement-1[x,y,z], measurement-2[x,y,z], measurement-3[x,y,z], ..., measurement-m[x,y,z]]

...

container-n [ measurement-1[x,y,z], measurement-2[x,y,z], measurement-3[x,y,z], ..., measurement-m[x,y,z]]

On the other hand, I have one measurement that will be tested against the training database to classify whether my measurement-x belongs to container 1, 2, ..., n.

My question is: how do I have to set up my CSV or Excel file for the database? And how can I test a measurement against my database set? I think I have to use X-Validation, right? If you need more information about my project, don't hesitate to ask.

## Answers

Regarding the database, look at the CSVReader or ExcelReader operators. They can read in CSV files and Excel sheets, respectively.
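As for how the file itself could be laid out: one common option is one row per measurement, with a fixed set of numeric columns and a label column naming the container. Because the measurements have different lengths (100 values, 130 values, ...), each one has to be reduced to the same number of columns first. The Python sketch below does this with a per-axis mean and standard deviation. That choice of features, the column names, and the helper names `summarize`, `build_rows` and `write_csv` are all assumptions made for illustration, not something from the original post or from RapidMiner.

```python
import csv
import io
import statistics

def summarize(measurement):
    """Reduce one variable-length measurement to a fixed-length feature row.

    `measurement` is a dict with keys "x", "y", "z", each a list of floats.
    Mean and standard deviation are an assumed, illustrative feature choice.
    """
    row = {}
    for axis in ("x", "y", "z"):
        values = measurement[axis]
        row[f"{axis}_mean"] = statistics.fmean(values)
        row[f"{axis}_std"] = statistics.pstdev(values)
    return row

def build_rows(containers):
    """Create one labeled row per measurement; the container name is the label."""
    rows = []
    for label, measurements in containers.items():
        for measurement in measurements:
            row = summarize(measurement)
            row["container"] = label
            rows.append(row)
    return rows

def write_csv(rows, handle):
    """Write the rows with a header line so the columns can be mapped on import."""
    fieldnames = ["x_mean", "x_std", "y_mean", "y_std", "z_mean", "z_std", "container"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Toy data (made up): measurements of different lengths in two containers.
containers = {
    "container-1": [
        {"x": [1.0, 2.0, 3.0], "y": [0.0, 0.0, 0.0], "z": [5.0, 5.0, 5.0]},
        {"x": [1.0, 3.0], "y": [0.5, 0.5], "z": [4.0, 6.0]},
    ],
    "container-2": [
        {"x": [10.0, 12.0, 14.0, 16.0], "y": [1.0, 1.0, 1.0, 1.0], "z": [0.0, 2.0, 0.0, 2.0]},
    ],
}

rows = build_rows(containers)
buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

In the resulting file, the `container` column would be the one to mark as the label when importing. The single measurement to be classified would go into a second file with the same feature columns but no label.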

X-Validation, or cross-validation, is a method for estimating how well a model performs on data it was not trained on. It is not used to classify new data.
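To make the distinction concrete, here is a minimal Python sketch, assuming each measurement has already been reduced to a fixed-length feature vector. The nearest-centroid classifier, the toy data and the function names are illustrative assumptions, not RapidMiner's implementation. Cross-validation trains throwaway models on parts of the training data to estimate accuracy; classifying the new measurement is a separate step that uses a model trained on all of the labeled data.

```python
import math

def train_centroids(rows):
    """Fit a nearest-centroid model: one mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))

def cross_validate(rows, k):
    """Estimate accuracy with k folds; the per-fold models are discarded."""
    correct = 0
    for fold in range(k):
        test = rows[fold::k]  # every k-th row, starting at `fold`
        train = [row for i, row in enumerate(rows) if i % k != fold]
        model = train_centroids(train)
        correct += sum(predict(model, features) == label for features, label in test)
    return correct / len(rows)

# Toy labeled data (made up): (feature vector, container label).
rows = [
    ([1.0, 1.0], "container-1"), ([9.0, 9.0], "container-2"),
    ([1.2, 0.8], "container-1"), ([8.8, 9.1], "container-2"),
    ([0.9, 1.1], "container-1"), ([9.2, 8.9], "container-2"),
]

# Step 1: cross-validation only answers "how good is this kind of model?"
accuracy = cross_validate(rows, k=3)

# Step 2: classification of the new measurement uses a model fit on all rows.
final_model = train_centroids(rows)
new_label = predict(final_model, [8.5, 9.5])
```

On this toy data the estimated accuracy is 1.0 and the new measurement is assigned to `container-2`. With real measurements the estimate would typically be lower, which is exactly the information cross-validation is meant to provide.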