Very strange - parameter changes during csv generation

spokspok Member Posts: 1 Contributor I
edited November 2019 in Help

I'm rather new to this and am trying text mining on a few hundred documents. I tokenize, filter stopwords, apply the Porter stemmer, filter by length, transform to lower case, and write the result to CSV.

In the CSV file, some stemmed words have probabilities in the range of 10e+11, although they have probabilities well below 1 in the example set table in RapidMiner (as they should, in my opinion).

This effect is reproducible and seems to occur cumulatively for certain parameters (words).

The effect also occurs if I copy and paste the example set matrix from RapidMiner into Excel.

What is going wrong?

Many thanks for any assistance


  • die_eike Member Posts: 10 Contributor II
    Hi, I know this post is very old, but I recently had the same problem. It is probably a bug. I compared the results in both RapidMiner and the CSV file. It's an error with the exponential (E) notation. Suppose that in RapidMiner the results are written with a MINUS exponent (e.g. 10E-11). When these results are written to CSV, the sign suddenly changes to PLUS (e.g. 10E+11). That's the error. Any solutions, RapidMiner team?
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,985 RM Engineering
    edited August 2019

    May I ask which RapidMiner version you are using? I tested this against the latest version, and it runs fine. I created an example set with the Data Editor, and simply added 2 rows, one with 1E+11 and one with 1E-11, both in a numerical and in a nominal column. The CSV result is as expected:
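
Independently of the RapidMiner test described above, a similar round-trip check can be sketched in plain Python. This is only an illustration of the idea, not RapidMiner's CSV writer: the helper name `roundtrip` and the single-column layout are assumptions made for this example. It writes 1E+11 and 1E-11 to CSV text, reads them back, and flags any value that comes back changed, which is what a flipped exponent sign would look like.

```python
import csv
import io


def roundtrip(values):
    """Write floats to in-memory CSV text and parse them back (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["value"])          # assumed single-column layout
    for v in values:
        writer.writerow([repr(v)])      # repr() keeps the exponent sign for small values
    buf.seek(0)
    reader = csv.reader(buf)
    next(reader)                        # skip the header row
    return [float(row[0]) for row in reader]


original = [1e11, 1e-11]
restored = roundtrip(original)

# Report every value whose round trip changed it, e.g. 1e-11 turning into 1e+11.
for before, after in zip(original, restored):
    status = "ok" if before == after else "MISMATCH"
    print(f"{before!r} -> {after!r} [{status}]")
```

Running the same comparison on a file exported from RapidMiner (reading the CSV column and comparing it with the values shown in the example set) would show whether the sign change happens at export time or later, for instance when a spreadsheet program opens the file.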
