StringIndexOutOfBoundsException?

cgkolar Member Posts: 29 Maven
edited November 2018 in Help
Hi everyone. I have been doing some work with RM over the past few months, and I keep running into a problem that I can only make go away by magic. I am using 32-bit RM 4.2 under Windows XP.

[tt]
P Jul 29, 2008 3:53:52 PM: [Fatal] StringIndexOutOfBoundsException occured in 1st application of Input (ExampleSource)
P Jul 29, 2008 3:53:52 PM: [Fatal] Process failed: operator cannot be executed. Check the log messages...
          Root[1] (Process)
here ==> +- Input[1] (ExampleSource)
          +- Learner[0] (DecisionTree)[/tt]

I am only trying a simple decision tree. I have checked for missing variables at the end of rows (none) and made sure that everything is typed properly. I have tried both comma- and tab-delimited files for input and get the same .dat output. As you can see, I am grasping at straws here. So, two questions: 1. Is there a simple explanation for what throws this error? 2. Where is the log file that the message is talking about? I have looked at the object info and the message window, which is where I copied the message from in the first place.

Thanks much,

--chris

Answers

  • Legacy User Member Posts: 0 Newbie

    Does the error still occur if your input csv contains only the header and the 1st line of the data?

    What happens if you set a breakpoint after the 1st operator?

    What happens if you remove the 2nd operator (the learner) completely?

  • cgkolar Member Posts: 29 Maven
    1.  With just the first line of data (no missing data) and the headers it read with no problem and ran all of the way through.

    2.  Adding a breakpoint after the example reader with the full data set threw the same error.

    3.  Removing the learner (so the only step was the example reader) threw the same error.

    Thanks,

    --chris
  • cgkolar Member Posts: 29 Maven
    I also saved the data in Excel format and then used the Excel example set reader.  I got the following error:

    [tt]P Jul 30, 2008 1:49:38 PM: [Fatal] IndexOutOfBoundsException occured in 1st application of ExcelExampleSource (ExcelExampleSource)
    P Jul 30, 2008 1:49:38 PM: [Fatal] Process failed: operator cannot be executed. Check the log messages...
              Root[1] (Process)
    here ==> +- ExcelExampleSource[1] (ExcelExampleSource)
              +- Learner[0] (DecisionTree)[/tt]

    The popup box also said "index 1 size 1"
  • Legacy User Member Posts: 0 Newbie
    Ok, so the issue is caused by some line (and not the 1st one) in the input file.
    It seems your options are:

    1. Keep splitting the input file until you find which line causes the behavior.

    2. Run RM in a debugger and find the offending line that way.

    3. Ask the development team whether there is any way to make ExampleSource print the line on which it stopped.
    Disclaimer: I'm not a member of the RM development team.
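
Option 1 above (keep splitting the input file) can be automated as a binary search. The following is only a minimal sketch, not part of RapidMiner: `find_bad_line` and the `loads_ok` callback are hypothetical names, and `loads_ok` stands in for whatever check you use, for example writing the lines to a temporary file and running the process on it. It assumes a row fails on its own, independent of the rows around it.

```python
from typing import Callable, List, Optional


def find_bad_line(
    header: str,
    rows: List[str],
    loads_ok: Callable[[List[str]], bool],
) -> Optional[int]:
    """Return the 0-based index in `rows` of one row that breaks loading.

    `loads_ok` is a hypothetical callback: it receives the header plus a
    subset of rows and returns True if that subset loads without error.
    Returns None if the full input already loads fine.
    """
    if loads_ok([header] + rows):
        return None

    # Invariant: rows[lo:hi] contains at least one failing row.
    lo, hi = 0, len(rows)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if not loads_ok([header] + rows[lo:mid]):
            hi = mid  # the first half fails on its own
        else:
            lo = mid  # the first half is fine, so the problem is in the rest
    return lo


if __name__ == "__main__":
    # Stand-in check for demonstration only: every line must have 3 fields.
    def three_fields(lines: List[str]) -> bool:
        return all(line.count(",") == 2 for line in lines)

    data = ["1,2,3", "4,5", "6,7,8"]
    print(find_bad_line("a,b,c", data, three_fields))
```

With n data rows this needs roughly log2(n) load attempts instead of n, which matters when each attempt means rerunning the whole process by hand.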

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    "3. Ask the development team whether there is any way to make ExampleSource print the line on which it stopped."
    Actually, we already print at least one message in the message viewer at the bottom (maybe you have to scroll up a bit?) saying things like:

    "G Jul 31, 2008 6:53:01 PM: [Error] Data format error in line 6: the line does not provide the expected number of columns (was: 59, expected: 61)! Stop reading..."

    if there is something going wrong (here, for example, in line 6). So maybe you just missed this message.


    On the other hand, your data file may simply contain something we did not try ourselves. In that case, such a message is probably not provided, and we would like to ask whether you could provide us with a sample of the data (if it is not too sensitive) so that we can try to improve the error messages.

    Cheers,
    Ingo
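
The column-count condition named in the message quoted above can also be checked outside RM before loading a file. This is a hedged sketch using Python's standard `csv` module; the function name `column_count_mismatches` is made up for illustration, and it assumes the first record is the header and defines the expected number of columns. It does not reproduce how ExampleSource actually parses files.

```python
import csv
from typing import Iterable, List, Tuple


def column_count_mismatches(
    lines: Iterable[str], delimiter: str = ","
) -> List[Tuple[int, int, int]]:
    """Return (record_number, actual_columns, expected_columns) per bad record.

    Record numbers are 1-based and count the header as record 1. They match
    physical line numbers as long as no quoted field spans several lines.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    expected = None
    problems = []
    for record_no, fields in enumerate(reader, start=1):
        if expected is None:
            expected = len(fields)  # the header sets the expected width
            continue
        if len(fields) != expected:
            problems.append((record_no, len(fields), expected))
    return problems


if __name__ == "__main__":
    sample = ["a,b,c", "1,2,3", "4,5", "6,7,8,9"]
    for record_no, was, expected in column_count_mismatches(sample):
        print(f"line {record_no}: was {was}, expected {expected}")
```

To scan a real file, pass an open file object, for example `open(path, newline="")`, and set `delimiter="\t"` for tab-delimited input.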
  • cgkolar Member Posts: 29 Maven
    Thanks. I have seen the column-count type of message before; what confuses me is that I am getting the generic Java error for no reason I can see.

    Please send a note to ckolar at imsa dot edu and I will send the data that created this situation.  Thanks much,  --chris
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    I just dropped you a note.

    Cheers,
    Ingo
  • Legacy User Member Posts: 0 Newbie
    Ingo,

    Can you also make ExampleSource more verbose in case of an error? For example:

    "Can't parse line # <line number> : <text of the line>
    Last successfully parsed line (that is, the prev line in the file) <text>"
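
The kind of reporting requested above could look roughly like the following. This is an illustrative sketch only, not RapidMiner code: `parse_lines`, `LineParseError`, and `parse_numbers` are hypothetical names. It wraps a per-line parser so that any low-level exception is re-raised with the line number, the failing text, and the last line that parsed successfully.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


class LineParseError(Exception):
    """Raised when a line cannot be parsed; carries context in its message."""


def parse_lines(lines: Iterable[str], parse_line: Callable[[str], T]) -> List[T]:
    """Apply `parse_line` to every line, adding context to any failure."""
    parsed: List[T] = []
    previous: Optional[str] = None
    for line_no, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        try:
            parsed.append(parse_line(text))
        except Exception as exc:
            last_ok = previous if previous is not None else "<none>"
            # Chain the original exception so the low-level cause stays visible.
            raise LineParseError(
                f"Can't parse line # {line_no} : {text}\n"
                "Last successfully parsed line "
                f"(that is, the prev line in the file) {last_ok}"
            ) from exc
        previous = text
    return parsed


def parse_numbers(text: str) -> List[float]:
    """Example per-line parser: comma-separated floats."""
    return [float(field) for field in text.split(",")]


if __name__ == "__main__":
    try:
        parse_lines(["1,2\n", "3,x\n", "5,6\n"], parse_numbers)
    except LineParseError as err:
        print(err)
```

The design point is that a generic exception such as an index error on its own says nothing about where in the input it happened, while the wrapper attaches that context at the one place where the line number is known.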
