Loop Files: Windows to Linux differences
I just installed RapidMiner on Linux (Solus) using Sun's JDK and copied over a repository from a Windows machine. Everything seems to work except for one problem that I don't fully understand.
When I use Loop Files to read a directory of CSV files with names like 20170101.csv, 20170102.csv, etc., they all come in out of order. This never happened on the Windows machine. I have tried renaming the files and playing around with the encoding settings, but I am not making much progress.
Before I reformat and go back to Windows 10, does anyone have any ideas?
Many thanks,
Alex
Edit: I have attached a couple of screenshots that might explain the problem better.
Best Answer
BalazsBarany (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert)
Hi,
The order in which a directory lookup returns files is not guaranteed on Linux. It may not be guaranteed on Windows either; you might have just gotten lucky, for example because the files were created in chronological order.
If you require a specific ordering (most users don't), do it in two steps:
1. Use Loop Files and retrieve the file names. Put them into an example set and sort it.
2. Use Loop Values (without parallel processing!) on the sorted example set and use the macro value as the file name.
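As an illustration only (this is plain Python, not RapidMiner code), the same two-step idea can be sketched as follows: first collect the file names, sort them explicitly, then process them one at a time in that order. The helper names `list_csv_sorted` and `process_in_order` are made up for this sketch, and the demo directory is a temporary one created just for the example.

```python
import os
import tempfile


def list_csv_sorted(directory):
    """Return the CSV file names in `directory`, sorted by name."""
    # os.listdir returns entries in arbitrary order, so sort explicitly
    # instead of relying on whatever order the file system hands back.
    names = [n for n in os.listdir(directory) if n.lower().endswith(".csv")]
    return sorted(names)


def process_in_order(directory, handle):
    """Call `handle(path)` for each CSV file, sequentially, in sorted order."""
    for name in list_csv_sorted(directory):
        handle(os.path.join(directory, name))


if __name__ == "__main__":
    # Demo with a throwaway directory; the names mimic those in the question.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["20170103.csv", "20170101.csv", "20170102.csv"]:
            open(os.path.join(tmp, name), "w").close()
        process_in_order(tmp, lambda p: print(os.path.basename(p)))
```

Because the names follow a zero-padded yyyymmdd pattern, a plain alphabetical sort also puts them in chronological order. The loop is deliberately sequential, which mirrors the advice to avoid parallel processing when the order matters.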
Regards,
Balázs
Answers
Thanks Balázs,
I will see what I can do.
Kind regards,
Alex