"Kalman Filtering in RapidMiner versus the Alternatives"

simonB Member Posts: 1 Contributor I
edited June 2019 in Help
I'm relatively new to RapidMiner and am trying to do the following. I think it needs some sort of Kalman filtering, which I believe is not available directly in RapidMiner but may be available through an R add-in.
THE PROBLEM: I'm looking at AIS ship data and trying to identify what cargo is in any oil tanker. I can use the draft of the ship to calculate directly the density of what's in it, as I know the volume (Archimedes' principle). However, there is noise in this draft observation (hopefully random noise). I can compare the observed (noisy) density with a list of, say, 5 possible cargo densities and calculate the probability of each cargo (assuming I know the variance of the error). I then calculate these probabilities historically for lots of ships of different sizes and other parameters making particular A-B journeys. I can then construct a covariance matrix of how these other parameters affect the likelihood of a ship carrying a particular cargo, which perhaps depends on ship size, journey beginning and endpoints, etc. I would probably do some sort of k-cluster classification here as well.
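
A minimal sketch of the probability step described above, assuming the noise on the observed density is Gaussian with a known standard deviation. The cargo names and densities, the function name `cargo_probabilities`, and the optional `priors` argument are illustrative assumptions, not values from my data:

```python
import math

# Illustrative cargo densities in kg/m^3 (assumed round numbers, not measured values).
CARGO_DENSITIES = {
    "gasoline": 740.0,
    "jet": 800.0,
    "diesel": 840.0,
    "crude": 870.0,
    "fuel_oil": 960.0,
}

def cargo_probabilities(observed_density, sigma, priors=None):
    """Return P(cargo | observed density) assuming Gaussian observation noise.

    sigma is the assumed standard deviation of the density error.
    priors is an optional dict of prior probabilities per cargo;
    when omitted, every cargo is treated as equally likely beforehand.
    """
    if priors is None:
        priors = {name: 1.0 / len(CARGO_DENSITIES) for name in CARGO_DENSITIES}
    weights = {}
    for name, rho in CARGO_DENSITIES.items():
        z = (observed_density - rho) / sigma
        # Gaussian likelihood up to a constant factor that cancels on normalisation.
        weights[name] = priors[name] * math.exp(-0.5 * z * z)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Example: a noisy density reading of 780 kg/m^3 with an assumed sigma of 40.
probs = cargo_probabilities(780.0, 40.0)
best = max(probs, key=probs.get)
```

Passing a non-uniform `priors` dict (for example, one derived from historical journeys) shifts the result toward cargoes that are common for that kind of ship and route.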

I then construct a model, based on this covariance matrix and the observed clustering, of which factors, excluding draft, affect the chance of a ship carrying a given cargo.

Now I would like to "adjust" the original draft data, assuming my model is correct. For example, I see a draft of 13.4 metres, which means the most likely cargo is gasoline, but there is still a reasonable chance that the cargo is not gasoline but diesel or jet. I then adjust the draft observation to 13.6 metres to make diesel most likely, based on my model, which says that more ships of this size, journey details, etc. carry diesel than gasoline. Obviously I'm creating a feedback loop here, and I think Kalman filtering is the way to go; however, there is no Kalman filtering in RapidMiner.
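
To make the adjustment concrete, here is a rough sketch of a single scalar Kalman-style measurement update, written in Python rather than as a RapidMiner process. It is not a full Kalman filter (there is no time-evolution step), and the model-expected draft of 13.8 metres and both variances are made-up numbers chosen only so that the 13.4 metre observation lands on 13.6 metres; the function name `fuse_draft` is likewise hypothetical:

```python
def fuse_draft(prior_mean, prior_var, observed, obs_var):
    """One scalar Kalman-style update: blend a model prediction with a noisy observation.

    prior_mean, prior_var: draft expected by the model and its variance (assumed known).
    observed, obs_var: measured draft and the variance of its noise (assumed known).
    Returns the adjusted draft, its variance, and the gain used.
    """
    # The gain says how much to trust the observation relative to the model.
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (observed - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var, gain

# Assumed numbers: the model expects 13.8 m, the AIS draft reads 13.4 m,
# and both are given the same variance (0.04 m^2), so they are weighted equally.
adjusted, adjusted_var, gain = fuse_draft(13.8, 0.04, 13.4, 0.04)
```

With equal variances the gain is 0.5, so the adjusted draft sits halfway between the model's expectation and the observation. A noisier observation (larger `obs_var`) would pull the result further toward the model.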

The question is whether I can adapt any of the existing machine learning algorithms in RapidMiner to achieve the same feedback effect. I somehow doubt it, but wondered if anyone has done something similar.

Thanks in advance,
