RapidMiner

Loop Files.....Windows to Linux differences

SOLVED
Contributor II hughesfleming68

I just installed RapidMiner on Linux (Solus) using Sun's JDK and copied over a repository from a Windows machine. Everything seems to work except for one problem that I don't fully understand.

When I use Loop Files to read a directory of CSV files with names like 20170101.csv, 20170102.csv, etc., they all come in out of order. This never happened on the Windows machine. I have tried renaming the files and playing around with the encoding settings, but I am not making much progress.

Before I reformat and go back to Windows 10, does anyone have any ideas?

Many thanks,

Alex

 

Edit: I have attached a couple of screenshots that might explain the problem better.

2 REPLIES
RM Certified Expert
Solution

Re: Loop Files.....Windows to Linux differences

Hi,

The ordering of files in directory listings is not guaranteed on Linux. It may not be guaranteed on Windows either; you might just have been lucky (for example, because the files were created in chronological order).

If you require a specific ordering (most users don't), do it in two steps:

1. Use Loop Files to retrieve the file names. Put them into an example set and sort it.

2. Use Loop Values (without parallel processing!) on the sorted example set and use the macro value as the file name.
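
For anyone who wants to see the same two-step idea outside RapidMiner, here is a minimal Python sketch. It is only an analogy, not a RapidMiner process: the helper names `sorted_csv_paths` and `read_in_order` are made up for illustration, and the sample files are created in a temporary directory. It assumes the file names follow the YYYYMMDD.csv pattern from the question, so that a plain sort by name is also chronological.

```python
import csv
import tempfile
from pathlib import Path


def sorted_csv_paths(directory):
    """Step 1: collect the CSV file names, then sort them explicitly.

    The raw directory listing comes back in a filesystem-dependent order,
    so the sort is what makes the order predictable.
    """
    return sorted(Path(directory).glob("*.csv"), key=lambda p: p.name)


def read_in_order(directory):
    """Step 2: process the sorted names one at a time, sequentially."""
    rows = []
    for path in sorted_csv_paths(directory):
        with path.open(newline="") as handle:
            for record in csv.reader(handle):
                rows.append((path.name, record))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create the sample files out of chronological order on purpose.
        for name in ["20170103.csv", "20170101.csv", "20170102.csv"]:
            Path(tmp, name).write_text(f"{name[:8]},1\n")
        for file_name, record in read_in_order(tmp):
            print(file_name, record)
```

The point is the same as in the two steps above: never rely on the order in which the directory listing arrives, sort the names yourself, and then loop over the sorted list without parallelism so the order is preserved.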

 

Regards,

Balázs

--
Balázs Bárány
Data Scientist, Vienna
https://datascientist.at
Contributor II hughesfleming68

Re: Loop Files.....Windows to Linux differences

Thanks Balázs,

I will see what I can do.

Kind regards,

Alex