"Failing to load Excel data using ExcelExampleSource"

amogha_ecamogha_ec Member Posts: 11 Contributor II
edited May 2019 in Help
Hello all,
             Am trying to load Excel data and to create Example set objec but ending up with  exception.I am confused on how to use ExcelExampleSource.Am confused on how to configure xml file and some concepts like .aml format.Both my xml file and .xls file reside in c drive Excel folder
name of xls file is AirTickets.xls which contains some 35000 entries..I am executing the following code..
                       RapidMiner.init();
try
{
Operator operator=OperatorService.createOperator(ExcelExampleSource.class);
operator.setParameter("attributes", "C:\\Excel\\excelexamplesource.xml");
OperatorDescription operatordis = operator.getOperatorDescription();
ExcelExampleSource excelexamplesource=new ExcelExampleSource(operatordis);
System.out.println("excelexamplesource="+excelexamplesource);
ExampleSet exampleSet = excelexamplesource.createExampleSet();
Iterator<Example> iterator= exampleSet.iterator();
Example example=iterator.next();

}
catch (OperatorCreationException e)
{
e.printStackTrace();
}

and my excelexamplesource.xml file looks like:

<?xml version="1.0" encoding="UTF-8"?>
<operator name="excelexamplesource" class="com.rapidminer.operator.nio.ExcelExampleSource" >
<parameter key="attributes" value="C:\Excel\Air_Tickets.xls"/>
</operator>

The Exception is:

" java.lang.NullPointerException
     java.io.FileInputStream.<init>(FileInputStream.java:103)"

Anyone please help..Explain exactly how to load Excel data..
           
Tagged:

Answers

  • Marco_BoeckMarco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    the easiest way is probably to create the RapidMiner process, configure all parameters in RapidMiner, load the process file (xyz.rmp) via
    String xml = com.rapidminer.tools.Tools.readTextFile(is);
    Process process = new Process(xml);
    and then execute it, and store the result:
    IOContainer ioResult = process.run();
    if (ioResult.getElementAt(0) instanceof ExampleSet) {
    resultSet = (ExampleSet)ioResult.getElementAt(0);
    }
    now you have the example set build from your excel file.

    Regards,
    Marco
  • amogha_ecamogha_ec Member Posts: 11 Contributor II
    Hello Marco,
                Thanks for your kind  reply..I tried your example and now am getting some IndexOutofBound Exception.Here is the code:

    String fileName="C:\\Excel\\excelexamplesource.xml";
    File file=new File(fileName);
    String xml = com.rapidminer.tools.Tools.readTextFile(file);
    Process process = new Process(xml);
    IOContainer ioResult = process.run();
    if (ioResult.getElementAt(0) instanceof ExampleSet)
                                    {
    ExampleSet resultSet = (ExampleSet)ioResult.getElementAt(0);

    }

    My excelexamplesource.xml file looks like:

    <?xml version="1.0" encoding="UTF-8"?>
    <operator name="ExcelExampleSource" class="Process">
    <parameter key="excel_file" value="C:\\Excel\\Employee1.xls"/>
    </operator>

    Here Employee1.xls is having only 10 entries. and the Exception is:

    java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
          java.util.ArrayList.RangeCheck(ArrayList.java:547)
          java.util.ArrayList.get(ArrayList.java:322)
          com.rapidminer.operator.IOContainer.getElementAt(IOContainer.java:112)
          ExcelLoader.process(ExcelLoader.java:97)
          ExcelLoader.doGet(ExcelLoader.java:41)
          javax.servlet.http.HttpServlet.service(HttpServlet.java:627)
          javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
          org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
          org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
          org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
          org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
          org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
          org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
          org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
          org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
          org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:843)
          org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:679)
          org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1293)
          java.lang.Thread.run(Thread.java:619)

    Please help...
  • Marco_BoeckMarco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    you need to have a valid process xml. Your xml consists of only an operator, which is by no means a valid process xml file.
    Just create a RapidMiner process, save it to your repository, and then take the saved *.rmp file and load it again.
    But for your purpose, you may use this as your process xml:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
        <process expanded="true" height="100" width="212">
          <operator activated="true" class="read_excel" compatibility="5.1.008" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">
            <parameter key="excel_file" value="C:\Excel\Employee1.xls"/>
            <list key="annotations"/>
            <list key="data_set_meta_data_information"/>
          </operator>
          <connect from_op="Read Excel" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Oh, and please note that in java filepaths should look something like this: C:\Excel\excel.xls, otherwise java will not find them.

    Regards,
    Marco
  • amogha_ecamogha_ec Member Posts: 11 Contributor II
    Hai Marco,
                   I copied your xml file and tried out,but again am getting exception:The exception is occuring at line: IOContainer ioResult = process.run(); and here goes the Exception:


    2011-05-31 09:25:09 INFO: No filename given for result file, using stdout for logging results! (WrapperLoggingHandler.log())
    2011-05-31 09:25:09 INFO: Process starts (Process.run())
    2011-05-31 09:25:11 SEVERE: Servlet.service() for servlet ExcelLoader threw exception (StandardWrapperValve.invoke())

    " java.lang.ArrayIndexOutOfBoundsException: -1"

         jxl.read.biff.SheetImpl.getCell(SheetImpl.java:325)
         com.rapidminer.operator.nio.model.ExcelResultSet.<init>(ExcelResultSet.java:120)
         com.rapidminer.operator.nio.model.ExcelResultSetConfiguration.makeDataResultSet(ExcelResultSetConfiguration.java:184)
         com.rapidminer.operator.nio.model.AbstractDataResultSetReader.createExampleSet(AbstractDataResultSetReader.java:103)
         com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
         com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:36)
         com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:123)
         com.rapidminer.operator.Operator.execute(Operator.java:829)
         com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
         com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
         com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:369)
         com.rapidminer.operator.Operator.execute(Operator.java:829)
         com.rapidminer.Process.run(Process.java:901)
         com.rapidminer.Process.run(Process.java:797)
         com.rapidminer.Process.run(Process.java:792)
         com.rapidminer.Process.run(Process.java:782)
         ExcelLoader.process(ExcelLoader.java:96)
         ExcelLoader.doGet(ExcelLoader.java:41)
         javax.servlet.http.HttpServlet.service(HttpServlet.java:627)
         javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
         org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
         org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
         org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
         org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
         org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
         org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
         org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
         org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
         org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:843)
         org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:679)
         org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1293)
         java.lang.Thread.run(Thread.java:619)
    2011-05-31 09:27:04 INFO: Reloading this Context has started (StandardContext.reload())
    ???

    Where am going wrong??
  • amogha_ecamogha_ec Member Posts: 11 Contributor II
    hello Marco,
                          I didnt get exactly what does this sentence mean  "Just create a RapidMiner process, save it to your repository, and then take the saved *.rmp file and load it again." Please make clear..I am new to Rapidminer...
  • Marco_BoeckMarco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    this exception indicates that your excel file cannot be read for some reason, probably because the parameters of the Read Excel operator do not work for your excel file.

    What I meant by my previous suggestion was: Open RapidMiner, click new. Then you choose the Read Excel operator, and drag&drop it in the main process window. Then connect the output port of this operator to the result port on the right side of the process view. Now you can click on the Read Excel operator, and specifiy its parameters on the right side of RapidMiner. Now you can run the process, and RapidMiner will create and show your result. Alternatively you can copy&paste my previously posted process xml into the xml tab and click the green check. If you are still unsure what to do exactly, please consult the manual pdf.
    Once you get the process to run without errors (you will need to adjust some parameters of the Read Excel operator) in RapidMiner, you can use the xml to create the process programmatically as shown above.

    Regards,
    Marco
  • amogha_ecamogha_ec Member Posts: 11 Contributor II
    Hello Marco,
                        Thanks for your support.Now am able to get ExampleSet out of Excel Sheet and able to iterate through ExampleSet without any Exception.But now i have stuck up with some strange result.There is column by name "Client" in my Excel Sheet.Am trying to access the particular cell value through this attribute Client.Here for upto some 100 rows am getting the valid cell value,but afterwards am getting invalid values '?'.Why its happening so.the code using to access cell value is:

    String clientname=(String)example.get("Client");

    Thanks in advance...
  • Marco_BoeckMarco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    this would be the code to extract the information.
    Attribute att = exampleSet.getAttributes().get("Client");
    for (Example example : exampleSet) {
          String value = example.getValueAsString(att); // if att is nominal, otherwise: Double value = example.getValue(att);
    }
    Is your RapidMiner process working, i.e. showing the correct result table with all the cell values? If not, you will need to fix your RapidMiner process.

    Regards,
    Marco
  • amogha_ecamogha_ec Member Posts: 11 Contributor II
    Hi Marco,

                     I tried your code...Still am gettin '?' values.There is another column by name "Airline".When i access that column values am getting all 35593 values correctly with the same code tat i was using earlier and with the one u sent.But the problem is with this "Client" column.I have less idea about this process configuration.

    And one more thing is  i have searched the API  doc and i didnt get exactly how to draw bar chart using barchart plotter class.Can you please send me  a code snippet to draw BarChart.

    I am mining some around 35000 rows from excel sheet and i  have to draw a chart using this data . I want to  know wheather the procedure am following is the best way to do and ll it works in  real time without any problem or is there any other way?

    Thanks in advance :)
  • Marco_BoeckMarco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    Make sure the process works correctly in RapidMiner, then it will also work if executed via the previously mentioned code. If you don't know how to work with RapidMiner, I suggest reading the manual or going to Youtube and search for some tutorials.

    Regards,
    Marco
  • amogha_ecamogha_ec Member Posts: 11 Contributor II
    Hi Marco,

    I have considered your suggestion.. ll look into that..thanks for your support.. :)
Sign In or Register to comment.