"Integer vs float performance"

wessel · November 2010

Assume that you have a lossless way to convert your data from floats to integers.

Would this speed up your RapidMiner process?
And what about memory usage?

If so, which algorithms would benefit most from doing all calculations on integers?

I found this table on the internet:
Comparison of Pentium Floating Point and Integer Speeds

Operation   Floating-point clocks   Integer clocks
add         1-3                     1-3
multiply    1-3                     10-11
division    39-42                   22-46
convert     6 (double to long)      3 (long to double)

Is this always true?

IngoRM · November 2010

Hi Wessel,

I am afraid I cannot say much about runtime. Looking at the table you provided, it could indeed be that some calculations are performed more quickly. But I would expect that many of the internal calculations are done in double precision anyway, so this probably would not really help. If we calculate a linear regression, for example, the data is transformed into a double matrix which is then inverted, so there will be no runtime improvement in that case.
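As a rough illustration of that point, here is a minimal Python sketch of a one-variable least-squares fit. It is a simplified stand-in and not RapidMiner's implementation; the function name `fit_line` and the sample data are made up for this example. The inputs are integers, but they are converted to floating point before any arithmetic, so the storage type of the input does not change the kind of calculation that is performed.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    # Convert the integer inputs to floating point up front,
    # mirroring the "transform to a double matrix" step.
    xs = [float(x) for x in xs]
    ys = [float(y) for y in ys]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Closed-form solution for a single predictor.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Integer-valued data that lie exactly on y = 2x + 1 (hypothetical).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

Both returned values are floating-point numbers even though every input was an integer.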

What is true is that memory usage should be roughly halved when you switch the data management from double to integer. The same holds for float instead of double, since integer and float each use only 4 bytes per value instead of the 8 bytes for double. We actually had one RapidMiner version (4.0 or 4.1, if I remember correctly) where the default data management was set to float. But it turned out that the precision was not high enough for many applications, especially for larger numbers, so we changed the default back to double.
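To make the trade-off concrete, here is a small Python sketch (again not RapidMiner code; the helper `to_float32` and the variable names are assumptions for illustration). It uses the standard `struct` module to report the per-value storage sizes and to round a value to 32-bit float precision, showing how a large whole number is no longer stored exactly.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("=f", struct.pack("=f", value))[0]


# Storage size per value, in bytes, for the standard-size formats.
bytes_per_double = struct.calcsize("=d")
bytes_per_float = struct.calcsize("=f")
bytes_per_int = struct.calcsize("=i")

# A 32-bit float has a 24-bit significand, so 2**24 + 1 is the first
# positive whole number it cannot represent exactly.
large_value = 2**24 + 1  # 16777217

stored_as_double = float(large_value)
stored_as_float = to_float32(float(large_value))

print(bytes_per_double, bytes_per_float, bytes_per_int)
print(stored_as_double, stored_as_float)
```

The double keeps 16777217 exactly, while the 32-bit float rounds it to 16777216. That is the kind of precision loss for larger numbers described above.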

Cheers,
Ingo

Preko · November 2010

Hi,

I remember that there are some operators where we can set the data management to integer or float, but I cannot find those parameters in the current release. I was looking for it in, e.g., Read CSV. How can I set the data management in this case?

Thanks, Zoltan

IngoRM · December 2010

Hi Zoltan,

You are right. The parameter is still there for several input operators, but for some reason it is now missing for CSV, Excel, Database, and ARFF. I have opened a bug report at

http://bugs.rapid-i.com/show_bug.cgi?id=446

Cheers and thanks for pointing this out,
Ingo
