"defining new variables (text-mining)"

derchief Member Posts: 5 Contributor II
edited May 2019 in Help
Hi,

I would like to define some additional variables for comparing groups of text files. For example, I would like to differentiate between texts written by certain age groups as well as between male and female sources. The result should allow interpretations like "older women use these words" or "women tend toward that explanation". Consequently, these variables should be selectable in the plot views.

Thanks and best regards,

chris

Answers

  • fischer Member Posts: 439 Maven
    Dear Chris,

    Can you elaborate on what exactly your problem is? Once the variables are in your data set, they should appear in the plot views. You might want to use an ExampleFilter to train different models for subsets of your data defined by the values of these variables, or ChangeAttributeRole to declare one of these variables as the label.

    Cheers,
    Simon
  • derchief Member Posts: 5 Contributor II
    Dear Simon,

    I'm close to solving the problem! I added an attribute (gender) that is selectable in the plot view. Is it now possible to set a value (e.g. "male") for a selection of examples in RapidMiner, or do I have to change the values in the .dat file using another application such as Excel?

    Cheers,
    Chris
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Chris,
    For this purpose I would recommend the AttributeConstruction operator. There you can specify a list of attributes to construct and enter scripting-language-style expressions that set their values. For example, the operator supports conditions (if) and mathematical expressions.
    One hint on usage: strings (and hence nominal values) have to be enclosed in double quotes (").

    Greetings,
      Sebastian
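
To illustrate the idea behind the answers above (construct a nominal attribute with an if-style condition, then select subsets by its value), here is a minimal sketch in plain Python. It is not RapidMiner syntax and does not call any RapidMiner operator; the example rows, the `m_`/`f_` source prefixes, the age threshold of 60, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical example set: one row per text file.
# The "source" ids and ages are made up for this sketch.
examples = [
    {"source": "m_01", "age": 67},
    {"source": "f_02", "age": 71},
    {"source": "f_03", "age": 24},
]


def construct_gender(example):
    """If-style construction of a nominal attribute from an existing one."""
    # Nominal values are strings, hence the double quotes.
    return "male" if example["source"].startswith("m_") else "female"


def construct_age_group(example, threshold=60):
    """Derive a nominal age group from the numeric age (threshold is assumed)."""
    return "older" if example["age"] >= threshold else "younger"


# Add the constructed attributes to every example.
for ex in examples:
    ex["gender"] = construct_gender(ex)
    ex["age_group"] = construct_age_group(ex)

# Select a subset by attribute values, similar in spirit to filtering examples.
older_female = [
    ex["source"]
    for ex in examples
    if ex["gender"] == "female" and ex["age_group"] == "older"
]
```

Once attributes like these exist in the data set, they can serve as grouping variables for plots or for training separate models per subset, as suggested earlier in the thread.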