[SOLVED] Assigning value types

steinar Member Posts: 5 Contributor II
edited November 2018 in Help
Hi

I'm just beginning with RapidMiner, but have gone through some tutorials. None seem to address this problem.

The data set I'm working with is huge and has over 1000 attributes. When I import it from a CSV file, RM guesses the attribute types. Most of my attributes should be binomial, but RM reads them as integers. Is there a way to assign them all as binomial at once, without clicking on each column and choosing binomial?

Also, is there a way to change these definitions once the data has been stored in a repository?

Best regards,
Steinar

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Steinar,

    I currently know of no way to make the Read CSV operator automatically assign attribute types other than those guessed by RapidMiner. But you can use the Parse Numbers operator to convert all or some nominal attributes to numbers.

    Cheers,
    Marius
  • steinar Member Posts: 5 Contributor II
    Thank you. I guess I'll just have to manually click then. It only takes me about half an hour for each set...

    I was hoping there was something like in RDS in which you can define the attributes in your spreadsheet program and then just define the row like you do with name and unit in RM.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Sorry, I misread your first post. The Numerical to Binominal operator might help you. Just have a look in the Data Transformation group in the operator tree view.

    It is currently not possible to add the attribute type as an annotation. Maybe this could be a useful feature for the future.

    Regards,
    Marius
  • steinar Member Posts: 5 Contributor II
    The operator you pointed me to does work for this, but more importantly it led me to understand that binomial data in RM is True/False, not 1/0. All I had to do was replace all instances of "1" with "true" in Excel, and then RM identified the data correctly as binomial.

    Thanks, now I can focus on my next problems.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Basically, a binominal attribute in RapidMiner is any nominal attribute with exactly two values, whether they are 0/1, true/false or cats/dogs. The problem is that the CSV parser assumes numerical values if it sees numerical input like 0/1.
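
The workaround from this thread (rewriting 0/1 as text labels before import, so the type guesser no longer sees numbers) can also be scripted rather than done by hand in a spreadsheet. The following is only a minimal sketch in plain Python and has nothing to do with RapidMiner's own API: the helper names `is_binary` and `relabel_binary_columns` are made up for illustration, and it assumes the file has a header row, that every row has the same number of fields, and that a column counts as binary only if all of its non-empty cells are exactly "0" or "1".

```python
import csv
import io


def is_binary(values):
    """True if every non-empty cell in the column is exactly "0" or "1"."""
    non_empty = [v.strip() for v in values if v.strip()]
    return bool(non_empty) and all(v in ("0", "1") for v in non_empty)


def relabel_binary_columns(csv_text, true_label="true", false_label="false"):
    """Rewrite 0/1 columns as text labels (hypothetical helper, not a RapidMiner call)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 2:
        return csv_text  # nothing to relabel without at least one data row

    header, data = rows[0], rows[1:]
    # Assumption: rows are rectangular, so zip(*data) yields complete columns.
    columns = list(zip(*data))
    binary = {i for i, col in enumerate(columns) if is_binary(col)}
    mapping = {"0": false_label, "1": true_label}

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in data:
        # Only cells in binary columns are relabeled; empty cells stay as they are.
        writer.writerow(
            [mapping.get(cell.strip(), cell) if i in binary else cell
             for i, cell in enumerate(row)]
        )
    return out.getvalue()
```

Called on the text of a CSV file, the function returns new CSV text in which only the pure 0/1 columns have been relabeled; ID or count columns that merely contain some 0s and 1s are left alone. Since any two labels make a binominal attribute, `true_label` and `false_label` can be set to whatever pair suits the data.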