
# Related attributes

Member Posts: 2 Contributor I
edited November 2018 in Help
How do I solve the problem of related attributes?

For example I have a dataset with the following structure:
| Economic Code 1 | Percentage | Economic Code 2 | Percentage | Economic Code 3 | Percentage | Success |
|---|---|---|---|---|---|---|
| EC12 | 60% | EC13 | 30% | EC14 | 10% | Yes |
| EC13 | 60% | EC15 | 20% | EC12 | 20% | No |
| EC19 | 50% | EC13 | 50% | EC14 | 0% | Yes |

The first and second attributes are related: the second attribute gives the percentage weight that should be assigned to the first. Similarly for the third and fourth. An attribute should be ignored if its weight is zero.

How do I go about solving this? Any tips or suggestions are welcome.

Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
Hi,

well, there are a lot of options, but two of them are pretty simple and worth a try:
• If the values EC12, EC13, etc. are actually ordered, you could translate those ordinal values into numerical ones and directly calculate a new attribute, for example by multiplying the weights with the numerical values.
• The second option is much easier and would definitely be my first choice: why bother at all? Just let the modeling scheme sort this out and simply feed your data into a classification scheme.
The rationale for the second suggestion is simply that many data mining schemes are capable of handling those feature interactions themselves. Let's take a decision tree, for example. The resulting model could look like
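As a rough illustration of the first option, here is a minimal Python sketch that collapses the code/percentage pairs into a single weighted attribute. The mapping from code to number (taking the numeric suffix, so EC12 becomes 12) is purely an assumption for illustration and only makes sense if the codes really are ordered that way. The names `code_to_number` and `weighted_code` are hypothetical helpers, not RapidMiner operators.

```python
# Rows from the example dataset: three (code, weight) pairs plus the label.
rows = [
    ([("EC12", 0.60), ("EC13", 0.30), ("EC14", 0.10)], "Yes"),
    ([("EC13", 0.60), ("EC15", 0.20), ("EC12", 0.20)], "No"),
    ([("EC19", 0.50), ("EC13", 0.50), ("EC14", 0.00)], "Yes"),
]


def code_to_number(code):
    """Assumed ordinal mapping: use the numeric suffix of the code (EC12 -> 12)."""
    return int(code[2:])


def weighted_code(pairs):
    """Sum of numeric code times weight; pairs with zero weight are skipped."""
    return sum(code_to_number(code) * weight for code, weight in pairs if weight > 0)


# One new numeric attribute per row, kept next to the original label.
new_attribute = [(weighted_code(pairs), label) for pairs, label in rows]

for value, label in new_attribute:
    print(round(value, 2), label)
```

Skipping zero-weight pairs does not change the sum, but it makes the "ignore attributes with zero weight" requirement explicit.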

if "Percentage of EC 1" > 80%
--- if "EC1" = EC12 then Yes
--- else No
else if ...

I hope you get the point. This is even more true for other learning schemes which take multiple attributes into account at the same time. Those often lack understandability though, so they might not be an option if you need an interpretable model.
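To make the tree sketched above concrete, it can be read as plain nested conditions on the code and its percentage. The following Python function is only a hand-written mirror of that one illustrative branch, not a learned model. The function name `sketch_tree` and the sample inputs are hypothetical, and the branches that the sketch leaves open ("else if ...") are deliberately not filled in.

```python
def sketch_tree(ec1, pct1):
    """Hand-written mirror of the first branch of the illustrative tree.

    ec1  -- value of the first economic code, e.g. "EC12"
    pct1 -- its weight as a fraction, e.g. 0.9 for 90%
    """
    if pct1 > 0.80:
        # The tree combines both related attributes in one path.
        return "Yes" if ec1 == "EC12" else "No"
    # The remaining branches are elided in the sketch, so nothing is predicted here.
    return None


# Hypothetical inputs, not rows from the example dataset.
print(sketch_tree("EC12", 0.90))
print(sketch_tree("EC13", 0.85))
print(sketch_tree("EC12", 0.60))
```

The point is that a single path through the tree tests the percentage and the code together, which is how the learner picks up the relation between the two attributes without any manual preprocessing.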

Cheers,
Ingo