Getting started with my data set in RM 5.0

cassius Member Posts: 1 Learner III
Hi all,

I'm new to RapidMiner and data mining (although I've done what, in retrospect, was some very basic data mining in the past). I do have some university level statistics under my belt, but that is about it.

I've created a data set that I would like to work on. In general terms, I have two numeric inputs which more or less follow a linear regression. As for the "more or less": I have a handful of non-numeric categorizations associated with each data pair on the regression line. I suspect these non-numerics will explain some of the directional wobble around the regression line (if that makes sense), so I would like to run some data mining trials against the data.

Now, from what I can understand, this data is 'polynominal' according to RapidMiner, so I am having a difficult time finding a mining function that works with the data set I've described. What are some good options for me to start with?

Thanks in advance.

Answers

  • haddock Member Posts: 849 Maven
    Ave Cassius!

    I came to RapidMiner primarily because it provided a nice environment for testing Support Vector Machines against large stacks of data. Why SVMs? Partly because of their speed compared to induction or neural nets, partly because they avoid the dreaded neural local pothole problem, and partly because they are like Swiss Army knives and can handle just about any combo of data types. The weird thing is that it worked as I had planned, because well-tuned SVMs are competitive, and because RM enables testing harnesses to be implemented quickly, even by mental midgets such as myself.


  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    you could convert the polynominal attributes to binominal attributes with the polynominal-to-binominal transformation. You can then turn these into binary 0/1-coded attributes that numerical methods like SVMs can use. This is a common way to handle such attributes.

    Greetings,
      Sebastian
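
To make the 0/1 coding described above concrete, here is a minimal sketch in plain Python rather than RapidMiner. It is not the RapidMiner operator itself, only an illustration of the idea: each distinct value of a nominal attribute becomes its own 0/1 column. The function name `dummy_code`, the column names (`x`, `y`, `region`), and the sample values are all invented for this example.

```python
# Hypothetical sketch: 0/1 (dummy) coding of nominal attributes in plain Python.
# All names and data below are invented for illustration.

def dummy_code(rows, nominal_columns):
    """Replace each nominal column with one 0/1 column per distinct value."""
    # Collect the distinct values of every nominal column, in sorted order.
    levels = {
        col: sorted({row[col] for row in rows})
        for col in nominal_columns
    }
    encoded = []
    for row in rows:
        # Keep the numeric attributes unchanged.
        new_row = {k: v for k, v in row.items() if k not in nominal_columns}
        for col in nominal_columns:
            for level in levels[col]:
                # 1 if this example has the given value, otherwise 0.
                new_row[f"{col}={level}"] = 1 if row[col] == level else 0
        encoded.append(new_row)
    return encoded


rows = [
    {"x": 1.0, "y": 2.1, "region": "north"},
    {"x": 2.0, "y": 3.9, "region": "south"},
    {"x": 3.0, "y": 6.2, "region": "east"},
]

encoded = dummy_code(rows, ["region"])
for row in encoded:
    print(row)
```

After this step every attribute is numeric, so the table could in principle be handed to a numerical learner such as a regression or an SVM, with the 0/1 columns capturing the category-dependent offsets around the regression line.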