"RegularExpression-bug?"

choose_username Member Posts: 33 Contributor II
edited June 2019 in Help
Hi there,

I have a data set whose lines look like the following:

41, Private, 109912, Bachelors, 13, Never-married, Other-service, Not-in-family, White, Female, 0, 0, 40, ?, <=50K


From time to time there is a '?'. I wanted to replace it, but RapidMiner didn't recognize it as a character.

I used the Replace operator and, in "replace what", entered the 'any character' symbol from the regular expression suggestions window. All other characters were replaced, but not the '?'.

Now I would like to know whether this is a bug or whether I did something wrong.
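
For reference, in ordinary regular expressions the 'any character' symbol `.` does match a literal '?', and a literal question mark has to be escaped because a bare `?` is a quantifier. The sketch below uses Python's `re` module as a stand-in (RapidMiner itself is not involved, and the shortened sample line is only an illustration):

```python
import re

# Shortened sample line, for illustration only.
line = "41, Private, 0, 40, ?, <=50K"

# '.' matches any single character (except a newline), so '?' is matched too.
any_char = re.sub(r".", "#", "a?b")

# A bare '?' is a quantifier; escape it to match a literal question mark.
literal = re.sub(r"\?", "Unknown", line)

print(any_char)
print(literal)
```

So if the '?' were an actual character in the data, a pattern like this would be expected to catch it.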


Greetings

User
Answers

  • choose_username Member Posts: 33 Contributor II
    I found out the following:

    If I change the file extension to .arff, I can't filter the '?' out.

    If I change the file extension to .csv, the '?' gets filtered out.


    Maybe this helps.

    ______________________

    User.
  • cherokee Member Posts: 82 Maven
    Hi choose_username,

    RapidMiner uses '?' to denote a missing value. Since no value is present, there is nothing for the Replace operator to match, so it cannot replace it. You can use either Replace Missing Values or Impute Missing Values instead, both found under Data Transformation > Data Cleansing.

    Best regards,
    chero
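
The difference between the two file extensions can be sketched in plain Python. This is only an illustration of the idea, not RapidMiner's actual parser: `parse_row` and `replace_missing` are hypothetical helpers, and it assumes that the reader for one format turns the '?' marker into "no value" while the other keeps it as ordinary text.

```python
MISSING = "?"  # assumed missing-value marker, as described above

def parse_row(line, missing_marker=MISSING):
    """Split a comma-separated line; fields equal to the marker become None."""
    fields = [f.strip() for f in line.split(",")]
    return [None if f == missing_marker else f for f in fields]

def replace_missing(row, fill):
    """Fill missing entries (None) with a chosen value."""
    return [fill if v is None else v for v in row]

line = ("41, Private, 109912, Bachelors, 13, Never-married, Other-service, "
        "Not-in-family, White, Female, 0, 0, 40, ?, <=50K")

# Marker interpreted as missing: there is no string left for a text replace to match.
as_missing = parse_row(line)

# Marker not interpreted (missing_marker=None): '?' stays a normal character.
as_text = parse_row(line, missing_marker=None)

filled = replace_missing(as_missing, "Unknown")
print(as_missing[13], as_text[13], filled[13])
```

In the first case a text-based replace has nothing to work on, which is why a dedicated missing-value step is needed; in the second case the '?' behaves like any other character.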