Apply Model with less Attributes

t_liebe Member Posts: 14 Contributor I
edited December 2018 in Help

Hey guys,

Given a working model, is it possible to apply it to an example set with fewer or different attributes? I have a lot of data on which I can build the model. However, I want to make predictions on data for which some of these attributes are not yet available.

Example:

ID  Happy (Label)  Text                           Legal Age
1   true           Lorem ipsum dolor sit amet, …  true
2   false          Lorem ipsum dolor sit amet, …  false
3   false          Lorem ipsum dolor sit amet, …  true
4   true           Lorem ipsum dolor sit amet, …  true
5   false          Lorem ipsum dolor sit amet, …  false

As you can see, Legal Age correlates with the label, but when I want to apply the model to new data sets, I only have the texts. Is that possible?

Thank you for your help.

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist
    Hi,
    What should the model do in this case if it needs the attribute? Some models can handle missing values: you can add the attribute as missing, and the model evaluates the example as if Legal Age were missing.
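
To illustrate the idea of adding missings, here is a minimal Python sketch (not RapidMiner code). It assumes the model was trained on the attributes Text and Legal Age from the example above; the function name `align_to_training_schema` is hypothetical. Whether a given learner can actually score an example with a missing value depends on the learner.

```python
from typing import Dict, List, Optional

# Attributes the model was trained on (assumed from the example table above).
TRAINING_ATTRIBUTES: List[str] = ["Text", "Legal Age"]


def align_to_training_schema(
    example: Dict[str, object],
    training_attributes: List[str],
) -> Dict[str, Optional[object]]:
    """Return a copy of `example` restricted to the training attributes.

    Attributes the example lacks are added as None (a missing value);
    attributes the model never saw, such as an ID, are dropped.
    """
    return {name: example.get(name) for name in training_attributes}


# A new example for which only the text is known.
new_example = {"ID": 6, "Text": "Lorem ipsum dolor sit amet, ..."}
aligned = align_to_training_schema(new_example, TRAINING_ATTRIBUTES)
print(aligned)
```

The aligned example has the same columns as the training data, so a model that tolerates missing values could be applied to it.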
    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • t_liebe Member Posts: 14 Contributor I

    The problem is that the attribute can't be evaluated easily.
    Maybe I can explain my goal with another example:

    ID  Blue  Yellow  Brown  Green (Label)
    1   1     1       0      1
    2   0     1       0      0
    3   1     0       0      0
    4   1     1       1      0
    5   0     1       1      0
    6   1     0       1      0

    ID  Blue  Yellow  Green (Label)
    1   1     1       50% 1; 50% 0
    2   0     1       0
    3   1     0       0

    Although the attribute Brown is missing, you can still make a prediction based on the first data set. I know this might not be possible, but I thought it was worth asking.
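
The expected predictions in the second table can be reproduced with a simple frequency count that ignores Brown. The Python sketch below is only an illustration of that reasoning, not something a RapidMiner operator is known to do; the function name is hypothetical and the rows are copied from the first table.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Training rows from the first table: (Blue, Yellow, Brown, Green).
TRAINING_ROWS: List[Tuple[int, int, int, int]] = [
    (1, 1, 0, 1),
    (0, 1, 0, 0),
    (1, 0, 0, 0),
    (1, 1, 1, 0),
    (0, 1, 1, 0),
    (1, 0, 1, 0),
]


def label_distribution_without_brown(
    rows: List[Tuple[int, int, int, int]],
) -> Dict[Tuple[int, int], Dict[int, float]]:
    """Estimate P(Green | Blue, Yellow) by counting labels and ignoring Brown."""
    counts: Dict[Tuple[int, int], Counter] = defaultdict(Counter)
    for blue, yellow, _brown, green in rows:
        counts[(blue, yellow)][green] += 1
    return {
        key: {label: n / sum(c.values()) for label, n in c.items()}
        for key, c in counts.items()
    }


distribution = label_distribution_without_brown(TRAINING_ROWS)
for blue, yellow in [(1, 1), (0, 1), (1, 0)]:
    print((blue, yellow), distribution[(blue, yellow)])
# (1, 1) {1: 0.5, 0: 0.5}
# (0, 1) {0: 1.0}
# (1, 0) {0: 1.0}
```

In effect this is a model built on Blue and Yellow alone, which is why dropping Brown before training gives the same answer.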

    Kind regards,
    Tobias

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist
    Hi,
    Don't you want to build a second model without the missing attribute, and then select at application time which model to use?
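
A rough Python sketch of selecting a model by data availability could look like this. The two model functions are hypothetical placeholders for models trained with and without Legal Age; only the selection logic is the point.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Example = Dict[str, Optional[object]]
Model = Callable[[Example], str]


def full_model(example: Example) -> str:
    """Hypothetical stand-in for a model trained on Text and Legal Age."""
    return "prediction from full model"


def text_only_model(example: Example) -> str:
    """Hypothetical stand-in for a model trained on Text alone."""
    return "prediction from text-only model"


# Ordered from most to least demanding; each entry lists the attributes it needs.
MODELS: List[Tuple[FrozenSet[str], Model]] = [
    (frozenset({"Text", "Legal Age"}), full_model),
    (frozenset({"Text"}), text_only_model),
]


def select_model(example: Example) -> Model:
    """Return the first model whose required attributes are all present and non-missing."""
    available = {name for name, value in example.items() if value is not None}
    for required, model in MODELS:
        if required <= available:
            return model
    raise ValueError("no model covers the available attributes")


example = {"Text": "Lorem ipsum dolor sit amet, ..."}
print(select_model(example)(example))
```

Listing the models from most to least demanding means the richest applicable model is used whenever its attributes are available.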
    BR,
    Martin
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    I agree, this seems like the case where a segmented scorecard based on underlying data availability would be the best solution.  This is a fairly typical setup in my experience.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts