# predicting float label which depends on polynominal attributes

Hi there!

I am new to RapidMiner and to data mining. RapidMiner is my first data mining tool and I'm very pleased with it; it is very good for newcomers. But I have some problems and I need someone who is an expert to help me.

I have data in Excel that looks like this:

variable X - **label**, a float variable (student grade, for example 3.73)

variable Y_{1} - **attribute**, nominal value (can have 4 values that I've coded as numbers: 1, 2, 3, 4)

variable Z_{1} - **attribute**, nominal value (can have 6 values that I've coded as numbers from 1 to 6)

variable Y_{2} - **attribute**, nominal value (can have 4 values that I've coded as numbers: 1, 2, 3, 4)

variable Z_{2} - **attribute**, nominal value (can have 6 values that I've coded as numbers from 1 to 6)

I want to predict X depending on Y_{1}, Z_{1}, Y_{2}, Z_{2}.

First I thought to use linear regression, converting nominal to binominal first (dummy coding), but RapidMiner output a model in which X depends on only 2 variables instead of all 4 (and it makes no sense that the others have no influence on X).

I've also used the Weka library linear regression, without dummy coding first, and got the same result.

Any help? Can you point me to how to set this up? Are there other algorithms for this problem (the label is a number, but the attributes are polynominal)?

I hope I'm making it clear what my problem is.

Thank you,

Mladen
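For readers following along, the dummy coding mentioned in the question can be sketched in plain Python. This is only an illustration, not RapidMiner's or Weka's implementation: the function name `dummy_code`, the example rows, and the choice to drop the first level of each attribute as a reference level are all assumptions made here. It shows why the four nominal attributes (4, 6, 4 and 6 levels) become 16 indicator columns, so a fitted model can contain many more than four terms.

```python
def dummy_code(rows, n_levels):
    """Expand integer-coded nominal columns into 0/1 indicator columns.

    rows     -- list of tuples, one integer code (1..k) per attribute
    n_levels -- number of levels k of each attribute, in column order

    Assumption: the first level of each attribute is the reference level
    and gets no column, so an attribute with k levels yields k - 1 columns.
    """
    coded = []
    for row in rows:
        out = []
        for value, k in zip(row, n_levels):
            if not 1 <= value <= k:
                raise ValueError(f"code {value} outside 1..{k}")
            # One indicator per non-reference level: 1 if the row has that level
            out.extend(1 if value == level else 0 for level in range(2, k + 1))
        coded.append(out)
    return coded


# Made-up example rows in the order (Y1, Z1, Y2, Z2); not data from the thread
rows = [(1, 6, 4, 2), (3, 1, 2, 5)]
coded = dummy_code(rows, n_levels=(4, 6, 4, 6))
print(len(coded[0]))  # 3 + 5 + 3 + 5 indicator columns
print(coded[0])
```

If the codes 1 to 4 are instead left as a numeric column, a linear regression treats them as an ordered quantity with equal spacing, which is usually not what is meant for nominal values.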

## Answers

The reason for this problem is the built-in feature selection of the linear regression methods. By default, M5 prime is used in both cases (RM and Weka). Simply turn it off (RM: feature selection = none, Weka: S = 1.0) and you should receive a model that refers to more than two attributes.

Greetings,

Helge
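To make concrete what a fit without attribute selection looks like, here is a minimal ordinary-least-squares sketch in plain Python. It is a generic textbook solution of the normal equations, not the code RapidMiner or Weka run, and the function name `fit_least_squares` and the toy grades are made up for illustration. The point is only that, with no selection step, every dummy-coded column receives a coefficient.

```python
def fit_least_squares(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    X -- list of rows of numeric (for example dummy-coded) attribute values
    y -- list of label values
    Returns [intercept, coef_1, ..., coef_p]. No attribute selection is
    performed, so every column of X gets a coefficient.
    """
    A = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept column
    p = len(A[0])
    # Augmented system (A^T A | A^T y)
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(p)]
        + [sum(a[i] * t for a, t in zip(A, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear columns)")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(p):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][p] for i in range(p)]


# Toy data (made up): one 3-level attribute, dummy-coded with level 1 as reference
X = [[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
y = [3.0, 3.5, 4.0, 4.5, 2.0, 2.5]
coefficients = fit_least_squares(X, y)
print(coefficients)  # intercept, then one coefficient per dummy column
```

With a single dummy-coded attribute the intercept is the mean label of the reference level and each coefficient is the difference between that level's mean and the reference mean.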

Sorry for the late response; I had some exams and was not around my computer. Your advice helped: RapidMiner managed to output all variables. I have some questions concerning the output, since I'm not sure if RapidMiner is using dummy coding as binary. I want to calculate residuals. I'll post the result and explain the question better later when I get to my desktop PC.

Mladen
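As a rough sketch of the residual calculation mentioned above: a residual is the observed label minus the model's prediction. The snippet below assumes the dummy columns are 0/1 indicators, so a coefficient is added only when the row has that level. Whether RapidMiner's output uses exactly this coding is the open question in the post and is not confirmed here. The names `predict` and `residuals` and all numbers are hypothetical.

```python
def predict(coefficients, row):
    """Intercept plus the sum of coefficient * indicator value."""
    intercept, *weights = coefficients
    return intercept + sum(w * v for w, v in zip(weights, row))


def residuals(coefficients, X, y):
    """Observed label minus predicted label, one value per row."""
    return [observed - predict(coefficients, row) for row, observed in zip(X, y)]


# Hypothetical model: intercept 3.25, then one coefficient per dummy column
model = [3.25, 1.0, -1.0]
X = [[0, 0], [1, 0], [0, 1]]   # made-up dummy-coded rows
y = [3.0, 4.5, 2.5]            # made-up observed grades
res = residuals(model, X, y)
print(res)
```

A positive residual means the model underestimates the grade for that row, a negative one means it overestimates.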