Samples / Help for Location Process
Hello,
I have the following task to explore:
We want to predict the position of a WiFi Client in a certain room. We have the positions of the Access-Points and the RSSI-Values (WLAN field strength).
I watched tutorials 1-8 on YouTube and experimented with the decision tree in Studio 7.1. I am a beginner in data mining, and it is hard for me to judge whether this is the right approach.
Does anybody have samples for this task, or for similar tasks?
Is the "decision tree" the right process in Studio to get a good result?
Thank you!
Regards
AlexO
Answers
Thanks for trying out RapidMiner! I think there are some ways you can get better results.
First of all, most data science problems are about how the data is represented. What does your table look like? I assume you have something like:
Truth, WIFI-Strength1, WIFI-Strength2, WIFI-Strength3
etc? Thinking about a useful representation is key.
My personal feeling (if you have a similar representation) is that a different model might work better. A Logistic Regression or an SVM inside a Polynominal by Binominal Classification operator might make sense.
Can you please tell us a bit more about the structure of your data?
Best,
Martin
Dortmund, Germany
Try searching Google Scholar for "RapidMiner + wifi" or "signal strength"; that should give you some pointers.
Unfortunately I could not find this paper. Thank you anyway.
Alex
Thanks for the encouragement. The question about the data is quickly answered: I am free! I can define whatever data I need.
What I will/should have is:
- The number of Access-Points (e.g. 5). Data 1 .. n
- The boundaries of the room I have to predict in (e.g. a square of 50x50 meters).
- "Learning data" (I am not sure how the position should be represented...)
--> I want to train the system before any prediction
- RSSI (field strength) + Position for the Learning data
- RSSI (field strength) without Position for the prediction
That's it.
I would be glad to get feedback.
Regards
Alex
You'll probably make measurements on defined points of the building and record the coordinates or the room identifier as the target variable (label). Then you can build models from this data and apply them to new data.
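To make the "measure, build a model, apply it to new data" idea concrete, here is a minimal sketch in plain Python rather than a RapidMiner process. It assumes a k-nearest-neighbour classifier, which is my choice for illustration and not something recommended earlier in this thread. The room labels, the RSSI values and the names `TRAINING` and `predict_room` are all invented.

```python
import math
from collections import Counter

# Hypothetical training data: one entry per measurement point.
# Each entry is (room label, RSSI per access point in dBm); all values are invented.
TRAINING = [
    ("room_a", [-40, -70, -80]),
    ("room_a", [-42, -68, -82]),
    ("room_b", [-75, -45, -60]),
    ("room_b", [-73, -47, -62]),
    ("room_c", [-85, -65, -41]),
    ("room_c", [-83, -63, -43]),
]


def distance(a, b):
    """Euclidean distance between two RSSI vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_room(rssi, training=TRAINING, k=3):
    """Return the majority label among the k nearest training measurements."""
    nearest = sorted(training, key=lambda row: distance(rssi, row[1]))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # A new reading without a known position.
    print(predict_room([-41, -69, -81]))
```

The label here is a room identifier. If you record coordinates instead, the same table layout applies, but you would need a regression model per coordinate rather than a classifier.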
You'll have a variable number of RSSIs. This is usually not easy to express in RapidMiner. So you'll probably filter for the top 3 or 5 signals and use the Pivot operator to transform the dataset so it only has one record per reading.
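Outside of RapidMiner, the "keep the top N signals and pivot to one record per reading" step could look roughly like the sketch below. It is plain Python, not the Pivot operator itself. The long input format `(measurement_id, station_id, rssi)` and the column names `topK_station` / `topK_rssi` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical long-format readings: (measurement_id, station_id, rssi_dbm).
READINGS = [
    (1, "ap1", -40), (1, "ap2", -70), (1, "ap3", -80), (1, "ap4", -90),
    (2, "ap2", -45), (2, "ap3", -60),
]


def pivot_top_n(readings, n=3):
    """Keep the n strongest signals per measurement and flatten them into one row."""
    grouped = defaultdict(list)
    for measurement_id, station_id, rssi in readings:
        grouped[measurement_id].append((station_id, rssi))

    rows = []
    for measurement_id in sorted(grouped):
        # Highest RSSI value first, i.e. the strongest signal.
        strongest = sorted(grouped[measurement_id], key=lambda s: s[1], reverse=True)[:n]
        row = {"measurement_id": measurement_id}
        for rank in range(1, n + 1):
            if rank <= len(strongest):
                station_id, rssi = strongest[rank - 1]
            else:
                station_id, rssi = None, None  # fewer than n stations were heard
            row[f"top{rank}_station"] = station_id
            row[f"top{rank}_rssi"] = rssi
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in pivot_top_n(READINGS, n=3):
        print(row)
```

Every output row has the same columns no matter how many stations were heard, which is the point of the transformation.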
The variable number of RSSIs is by design. There are many effects which can change the RSSI...
So is RapidMiner the wrong tool for this?
Models just need to have a fixed attribute schema (this is true in any product). They can't work with non-tabular data. Many algorithms also can't work with missing data (this is again conceptual, not a RapidMiner limitation).
Some possible solutions:
- If you have a fixed number of stations installed, your table could be like this:
Measurement ID; Position; Station1; Station2; ... StationN
If no signal strength of Station5 is available, you just put 0 into it.
RapidMiner works well with a large number of attributes, and the structure can be created automatically, e.g. with the Pivot operator.
- If the number of stations is not fixed and higher than you'd like to express in the previous data structure, you could go with this:
Measurement ID; Position; Top1StationID; Top1StationStrength; Top2StationID; Top2StationStrength; ... as long as it makes sense.
Your ultimate requirement is to express each "example" (measurement, position) in one row in a tabular data structure. That's it.
I would guess that the first representation is easier to work with and it's also better suited for most modeling algorithms.
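As an illustration of the first representation (one column per station, 0 where a station was not heard), here is a small plain-Python sketch. The station names, positions and readings are invented, and `to_fixed_schema` is a hypothetical helper, not a RapidMiner function.

```python
# Hypothetical inputs: the station list, labelled positions, and long-format readings.
STATIONS = ["ap1", "ap2", "ap3"]
POSITIONS = {1: "room_a", 2: "room_b"}
READINGS = [
    (1, "ap1", -40), (1, "ap2", -70),
    (2, "ap2", -45), (2, "ap3", -60),
]


def to_fixed_schema(readings, positions, stations, fill=0):
    """Build one row per measurement with one column per station."""
    rows = {
        mid: {"measurement_id": mid, "position": pos, **{s: fill for s in stations}}
        for mid, pos in positions.items()
    }
    for mid, station, rssi in readings:
        # Ignore readings for unknown measurements or unknown stations.
        if mid in rows and station in stations:
            rows[mid][station] = rssi
    return [rows[mid] for mid in sorted(rows)]


if __name__ == "__main__":
    for row in to_fixed_schema(READINGS, POSITIONS, STATIONS):
        print(row)
```

The `fill` value is a parameter so that the placeholder for a missing station can be changed without touching the rest of the code.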