RapidMiner

StringIndexOutOfBoundsException ?

Regular Contributor

Hi everyone.  I have been doing some work with RM over the past few months and I keep running into a problem that I can only make go away by magic.  I am using 32-bit RM 4.2 under Windows XP.

[tt]
P Jul 29, 2008 3:53:52 PM: [Fatal] StringIndexOutOfBoundsException occured in 1st application of Input (ExampleSource)
P Jul 29, 2008 3:53:52 PM: [Fatal] Process failed: operator cannot be executed. Check the log messages...
          Root[1] (Process)
here ==> +- Input[1] (ExampleSource)
          +- Learner[0] (DecisionTree)[/tt]

I am only trying a simple decision tree. I have checked for missing variables at the end of rows (none) and made sure that everything is typed properly. I have tried both comma- and tab-delimited files for input and get the same .dat output. As you can see, I am grasping at straws here.  So two questions:  1. Is there a simple explanation for what throws this error, and 2. where is the log file that it is talking about?  I have looked at the object info and the message window, which is where I copied the message from in the first place.

Thanks much,

--chris
8 REPLIES

Re: StringIndexOutOfBoundsException ?


Does the error still occur if your input csv contains only the header and the 1st line of the data?

What happens if you set a breakpoint after the 1st operator?

What happens if you remove the 2nd operator (the learner) completely?

Regular Contributor

Re: StringIndexOutOfBoundsException ?

1.  With just the first line of data (no missing data) and the headers it read with no problem and ran all of the way through.

2.  Adding a breakpoint after the example reader with the full data set threw the same error.

3.  Removing the learner (so the only step was the example reader) threw the same error.

Thanks,

--chris
Regular Contributor

Re: StringIndexOutOfBoundsException ?

I also saved the data in Excel format and then used the Excel example set reader.  I got the following error:

[tt]P Jul 30, 2008 1:49:38 PM: [Fatal] IndexOutOfBoundsException occured in 1st application of ExcelExampleSource (ExcelExampleSource)
P Jul 30, 2008 1:49:38 PM: [Fatal] Process failed: operator cannot be executed. Check the log messages...
          Root[1] (Process)
here ==> +- ExcelExampleSource[1] (ExcelExampleSource)
          +- Learner[0] (DecisionTree)[/tt]

The popup box also said "index 1 size 1"

Re: StringIndexOutOfBoundsException ?

Ok, so the issue is caused by some line (and not the 1st one) in the input file.
It seems your options are:

1. Keep splitting the input file until you find which line causes the behavior.

2. Run RM in a debugger and find that out.

3. Ask the development team whether there is any way to make ExampleSource print the line on which it stopped.
Disclaimer: I'm not a member of the RM development team.
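
Option 1 (repeatedly splitting the file) can be sketched as a binary search over the data rows. This is only an illustration: `fails` is a hypothetical callback standing in for "load these rows in RM and see whether the error appears", and the sketch assumes that a prefix of the file fails exactly when it contains the offending row.

```python
from typing import Callable, Sequence


def first_failing_row(rows: Sequence[str],
                      fails: Callable[[Sequence[str]], bool]) -> int:
    """Return the 0-based index of the first row that makes `fails` true.

    Returns -1 if the complete input does not fail at all.
    `fails` is a hypothetical stand-in for running the reader on a subset.
    """
    if not fails(rows):
        return -1
    lo, hi = 0, len(rows)  # invariant: rows[:lo] passes, rows[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(rows[:mid]):
            hi = mid   # the culprit is within the first `mid` rows
        else:
            lo = mid   # the first `mid` rows are fine
    return hi - 1
```

With n data rows this needs about log2(n) test runs instead of n, which matters when each test means reloading the file by hand.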

RMStaff

Re: StringIndexOutOfBoundsException ?

Hi,



3. Ask the development team whether there is any way to make ExampleSource print the line on which it stopped.


Actually, we already write at least one message to the message viewer at the bottom (maybe you have to scroll up a bit?), saying things like:

"G Jul 31, 2008 6:53:01 PM: [Error] Data format error in line 6: the line does not provide the expected number of columns (was: 59, expected: 61)! Stop reading..."

if there is something going wrong (here, for example, in line 6). So maybe you just missed this message.


On the other hand, your data file may simply contain something we have not tried ourselves. In that case, such a message is probably not provided, and we would like to ask whether you could provide us with a sample of the data (if it is not too sensitive) so that we can try to improve the error messages.

Cheers,
Ingo
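
The column-count check described in the message above can also be reproduced outside RM to pre-screen a file. The following is a minimal sketch, not RM's own code: it assumes a delimited text file whose first line is the header, and the function name is made up for illustration.

```python
import csv
from typing import Iterable, List, Tuple


def find_column_mismatches(lines: Iterable[str],
                           delimiter: str = ",") -> List[Tuple[int, int, int]]:
    """Return (line_number, found, expected) for every row whose field
    count differs from the header row's field count."""
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader, None)
    if header is None:          # empty input: nothing to report
        return []
    expected = len(header)
    mismatches = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the 1-based number of the line just read
            mismatches.append((reader.line_num, len(row), expected))
    return mismatches


# Hypothetical usage on a file (the file name is an assumption):
# with open("data.csv", newline="") as f:
#     for line_no, found, expected in find_column_mismatches(f):
#         print(f"line {line_no}: was {found}, expected {expected}")
```

Blank lines are reported too, since they parse as rows with zero fields. For tab-delimited input, pass `delimiter="\t"`.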
Regular Contributor

Re: StringIndexOutOfBoundsException ?

Thanks. I have seen the column-count type of message before; what confuses me is that I am getting the generic Java error for no reason that I can see.

Please send a note to ckolar at imsa dot edu and I will send data that created this situation.  Thanks much,  --chris
RMStaff

Re: StringIndexOutOfBoundsException ?

I just dropped you a note.

Cheers,
Ingo

Re: StringIndexOutOfBoundsException ?

Ingo,

can you also make ExampleSource more verbose in case of an error?

"Can't parse line # <line number> : <text of the line>
Last successfully parsed line (that is, the prev line in the file) <text>"
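
The kind of message requested above can be sketched generically. This is not how ExampleSource is implemented; `parse_line` is a hypothetical per-line parser, and `LineParseError` is a name invented for this example. The wrapper catches whatever the parser throws and re-raises it with the line number, the offending text, and the previous line attached.

```python
from typing import Callable, Iterable, List


class LineParseError(Exception):
    """Parse failure annotated with line number and surrounding text."""


def parse_lines(lines: Iterable[str],
                parse_line: Callable[[str], object]) -> List[object]:
    """Apply `parse_line` to each line; on failure, raise an error that
    names the line number, the line itself, and the last good line."""
    results = []
    previous = None  # last line that parsed successfully, if any
    for number, text in enumerate(lines, start=1):
        try:
            results.append(parse_line(text))
        except Exception as exc:
            raise LineParseError(
                f"Can't parse line #{number}: {text!r}\n"
                f"Last successfully parsed line: {previous!r}"
            ) from exc
        previous = text
    return results
```

Chaining with `from exc` keeps the original low-level exception (the analogue of the generic index error here) visible in the traceback, while the top-level message points at the data.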