Problem while uploading files (with special characters)

ralph-wgtralph-wgt Member Posts: 6 Contributor II
edited November 2018 in Help
Hello there,

We are facing several problems with RapidAnalytics. I would like to describe them here in the hope that someone has run into the same issues and knows how to solve them.

To begin with, I tried to upload a CSV file containing special characters (Swedish, which should be representable in UTF-8 as well as in ISO-8859-1 and Windows-1252). In RapidMiner itself this causes no problem.
But when I copy and paste the imported file from the local repository to the RapidAnalytics repository (home directory), the upload appears to succeed at first. When I then try to open the file from the RapidAnalytics repository (for example by double-clicking it), I get:

"Message: Cannot parse I/O-Object: com.rapidminer.repository.RepositoryException: Cannot download IOObject: 500: Internal Server Error"

At that moment the server logs the following:
"13:52:14,438 INFO  [de.rapidanalytics.ejb.RepositoryStorageEJBImpl] admin submitted new object to /home/hoepken/data/OverallBookings_Ausschnitt1.
13:52:14,463 INFO  [de.rapidanalytics.ejb.RepositoryEJBImpl] Creating recursively: home/hoepken/data
13:52:14,467 INFO  [de.rapidanalytics.ejb.RepositoryEJBImpl] admin created entry '/home/hoepken/data/OverallBookings_Ausschnitt1 of type 'data'.
13:52:14,482 INFO  [com.rapidminer.example.db.ExampleSetToDB] Dropping data tables for es_65
13:52:14,504 INFO  [com.rapidminer.example.db.ExampleSetToDB] Cannot determine number of subtables. Probably tables already deleted: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist

13:52:14,512 INFO  [com.rapidminer.example.db.ExampleSetToDB] Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
. (Ignoring)
13:52:14,520 INFO  [com.rapidminer.example.db.ExampleSetToDB] Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
. (Ignoring)
13:52:14,528 INFO  [com.rapidminer.example.db.ExampleSetToDB] Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
. (Ignoring)
13:52:14,865 WARNING [de.rapidanalytics.entity.IOObjectVersion] Cannot store example set: java.sql.SQLException: ORA-12899: value too large for column "RAPIDANALYTICS"."es_65_meta"."attributeName" (actual: 22, maximum: 21)
: java.sql.SQLException: ORA-12899: value too large for column "RAPIDANALYTICS"."es_65_meta"."attributeName" (actual: 22, maximum: 21)"


---

I get the first INFO messages ("Failed to drop table") with every database (over the past few days I tried MySQL, PostgreSQL and Oracle).


The file size is evidently not the problem: I also cut the CSV file down to just a few rows, so that it is only 72 KB.


I would be very glad if anyone could help. If you need more information or log files, please let me know and I will send them immediately.
Thanks in advance!

Ralph-J. Andris
University of Applied Sciences Ravensburg-Weingarten

Answers

  • ralph-wgtralph-wgt Member Posts: 6 Contributor II
    More information about the system:

    OS: Windows Server 2008 64-bit
    RA-Version: 1.1 Bundle (automatic installer)
    Database: Oracle 11g Express Edition (also tried MySQL and PostgreSQL)

    Fresh installation of RA. No problems discovered during installation.
    We piped the StdOut to a textfile and went through it, finding no obvious problems (except for this "duplicate bug" :-))
  • fischerfischer Member Posts: 439 Maven
    Hi,

    Something does not fit here. The error message is about storing, and you are talking about downloading.

    Did you configure your database to use UTF8 in all places (database, connection, etc.)? You can edit server/default/deploy/rapidanalytics-ds.xml to add the necessary parameters to the connection URL if that is required.

    Best,
    Simon
  • ralph-wgtralph-wgt Member Posts: 6 Contributor II
    Hello Mr. Fischer,

    thanks for your reply.
    I still think the problem lies in uploading the file, not in downloading it. I only used this example to help localize the problem.

    After uploading the file, I do not even get the metadata for the uploaded object, nor can I download it.
    The error messages that I posted already occur while uploading the file.

    The database is configured to UTF8 and the connection works well.

    Thanks for your time!

    Best,
    Ralph
  • fischerfischer Member Posts: 439 Maven
    Hi,

    when you get a 500, there must be an exception in the logs. Can you identify that in the server.log?

    Best,
    Simon
  • user194372user194372 Member Posts: 14 Contributor II

    Hello

     

    I'm getting the same error.

     

    server.log shows:

    Caused by: java.sql.SQLException: ORA-12899: value too large for column "RAPIDM7"."es_46939_meta"."attributeName" (actual: 16, maximum: 8)

     

    RM Studio error message is:

    Cannot save repository data

    Cannot store data in repository at entry '//Raptor.98/home/zzzzz'

    Reason: Cannot upload object. 500 internal Server Error

     

    We are using RM Studio 7.2 64bit and RM Server 7.2 64bit Linux 

    Oracle version is 11.2.0.1

     

    Oracle language parameters (as in the sys.props$ table):

    (1)NCHAR Character set - NLS_NCHAR_CHARACTERSET=UTF8

    (2)Character set - NLS_CHARACTERSET=UTF8

    (3)Language - NLS_LANGUAGE=KOREAN_KOREA.UTF8 

     

    Operators used in RM Studio:

    (1) Read Excel

    (2) Store (to server repository)

     

    My personal analysis:

    This error occurs when Korean letters are in the Excel header row (first row). Korean letters in the following rows cause no problem; only Korean letters in the first row generate errors.

    The Excel table is supposed to be saved in tables such as "es_47542_meta", "es_47542_nominal_mapping" and "es_47542_data_1". The table "es_47542_meta" has a column named attributeName, which should contain the header names. This is the column that fails: when there is any Korean character in the header row, the calculation that determines the width of the attributeName column does not work.

    We are using UTF-8, so every Korean character needs 3 bytes. If I insert a dummy (fictitious) header into the header row that is long enough to hold the longest Korean header, the process works fine. For example, if the longest Korean header has three Korean letters, it requires nine bytes (3 x 3); if there is an English header of nine characters or more, the attributeName column is created at least nine bytes wide and the process works fine.

    It seems that the system (either RM or Oracle) does not account for the byte width of Korean characters represented in the UTF-8 character set.
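The mismatch described in the analysis above can be reproduced outside the server. The following is a minimal sketch, assuming the column width is derived from the longest header counted in characters while the database enforces the limit in bytes; that is a guess based on the error messages, not confirmed RapidMiner behaviour. The header names are hypothetical examples.

```python
# Hypothetical header row: one ASCII header and two Korean headers.
headers = ["aaabb", "사과", "햄버거"]


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Assumption: the column is sized by the longest header in characters.
width_by_chars = max(len(h) for h in headers)

# What a byte-limited column would actually need.
width_by_bytes = max(utf8_len(h) for h in headers)

# Headers that would overflow a column of width_by_chars bytes.
too_long = [h for h in headers if utf8_len(h) > width_by_chars]

# Workaround from the post: add an ASCII dummy header that is as long
# (in characters) as the longest header is in bytes.
dummy = "x" * width_by_bytes
padded_width = max(len(h) for h in headers + [dummy])
still_too_long = [h for h in headers if utf8_len(h) > padded_width]

print(width_by_chars, width_by_bytes, too_long, still_too_long)
```

With these headers the character-based width is 5 while the two-letter Korean header already needs 6 bytes, which matches the shape of the "actual: 6, maximum: 5" message in the logs. Oracle's VARCHAR2(n) counts bytes unless character length semantics are requested, so a width computed in characters would be too small for multi-byte text.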

     

     

    test1.jpg

     

    I used ojdbc7-12.1.0.jar as the JDBC driver for the RM server.

    If I change it to ojdbc6.jar, it shows the same error.

     

     

     

     

     

     

  • user194372user194372 Member Posts: 14 Contributor II

     

    This problem is not resolved yet.

     

    The following is what I tried in order to fix this, but none of it worked. The same error message is shown.

     

    1. standalone.conf

     

    JAVA_OPTS="$JAVA_OPTS -Dfile.encoding=UTF-8"
    JAVA_OPTS="$JAVA_OPTS -DNLS_LANG=KOREAN_KOREA.UTF8"
    JAVA_OPTS="$JAVA_OPTS -Dsun.jnu.encoding=utf-8"
    JAVA_OPTS="$JAVA_OPTS -Djavax.servlet.request.encoding=UTF8"
    JAVA_OPTS="$JAVA_OPTS -Dorg.apache.catalina.connector.URI_ENCODING=UTF-8"

     

    2. standalone.xml

     

    <system-properties>
    <property name="org.apache.catalina.connector.URI_ENCODING" value="UTF-8"/>
    <property name="file.encoding" value="UTF-8"/>
    <property name="org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING" value="true"/>
    <property name="sun.jnu.encoding" value="UTF-8"/>
    </system-properties>

     

    <datasource jta="true" jndi-name="java:/jdbc/RapidAnalyticsDS" . . . . . . . . . . . . . . 

        <connection-property name="char.encoding">UTF-8</connection-property>
        <connection-property name="useUnicode">yes</connection-property>
        <connection-property name="connectionCollation">utf8_bin</connection-property>
        <connection-property name="characterSetResults">UTF-8</connection-property>

     

     

    3. server.log shows following (entries related to UTF-8 character encoding)

            NLS_LANG = KOREAN_KOREA.UTF8

            [Standalone] =

            java.version = 1.8.0_101

            java.vm.name = Java HotSpot(TM) 64-Bit Server VM

            os.name = Linux

            os.version = 2.6.32-504.el6.x86_64

            file.encoding = UTF-8

            sun.io.unicode.encoding = UnicodeLittle

            sun.jnu.encoding = UTF-8

            user.language = ko

            javax.servlet.request.encoding = UTF-8

            org.apache.catalina.connector.URI_ENCODING = UTF-8

            user.timezone = ROK

     

    Can anybody help us, please?

     

  • user194372user194372 Member Posts: 14 Contributor II

     

    server.log shows the following errors:

     

    1. case one

    14:05:17,936 INFO [de.rapidanalytics.ejb.RepositoryStorageEJBImpl] (http-/0.0.0.0:8080-1) admin submitted new object to /hangul1/kortest1.
    14:05:18,061 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-1) Dropping data tables for es_123
    14:05:18,468 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-1) Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
    . (Ignoring)
    14:05:18,499 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-1) Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
    . (Ignoring)
    14:05:18,733 WARNING [de.rapidanalytics.entity.IOObjectVersion] (http-/0.0.0.0:8080-1) Cannot store example set: java.sql.SQLException: ORA-12899: value too large for column "RAPID"."es_123_meta"."attributeName" (actual: 6, maximum: 5)
    : java.sql.SQLException: ORA-12899: value too large for column "RAPID"."es_123_meta"."attributeName" (actual: 6, maximum: 5)

     

     

     

    2. case two:

    13:09:57,120 INFO [de.rapidanalytics.ejb.RepositoryStorageEJBImpl] (http-/0.0.0.0:8080-2) admin submitted new object to /hangul1/kortest101.
    13:09:57,143 INFO [de.rapidanalytics.ejb.RepositoryEJBImpl] (http-/0.0.0.0:8080-2) admin created entry '/hangul1/kortest101 of type 'data'.
    13:09:57,194 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-2) Dropping data tables for es_643
    13:09:57,228 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-2) Cannot determine number of subtables. Probably tables already deleted: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist

    13:09:57,254 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-2) Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
    . (Ignoring)
    13:09:57,257 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-2) Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
    . (Ignoring)
    13:09:57,260 INFO [com.rapidminer.server.example.db.ExampleSetToDB] (http-/0.0.0.0:8080-2) Failed to drop table: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
    . (Ignoring)
    13:09:57,385 WARNING [de.rapidanalytics.entity.IOObjectVersion] (http-/0.0.0.0:8080-2) Cannot store example set: java.sql.SQLException: ORA-12899: value too large for column "RAPM"."es_643_meta"."attributeName" (actual: 6, maximum: 5)
    : java.sql.SQLException: ORA-12899: value too large for column "RAPM"."es_643_meta"."attributeName" (actual: 6, maximum: 5)

    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:450) [ojdbc7-12.1.0.2.jar:12.1.0.2.0]
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:399) [ojdbc7-12.1.0.2.jar:12.1.0.2.0]
    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1017) [ojdbc7-12.1.0.2.jar:12.1.0.2.0]
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:655) [ojdbc7-12.1.0.2.jar:12.1.0.2.0]
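Each ORA-12899 line in the logs above names the schema, table and column and gives the actual and maximum size. A small sketch for pulling those values out of a server.log excerpt follows; the function name `find_overflows` is made up for illustration, and the sample text is copied from case two.

```python
import re

# Matches the ORA-12899 message format seen in the log excerpts.
ORA_12899 = re.compile(
    r'ORA-12899: value too large for column '
    r'"(?P<schema>[^"]+)"\."(?P<table>[^"]+)"\."(?P<column>[^"]+)" '
    r'\(actual: (?P<actual>\d+), maximum: (?P<maximum>\d+)\)'
)


def find_overflows(log_text):
    """Return one dict per distinct ORA-12899 message in log_text."""
    found = []
    for match in ORA_12899.finditer(log_text):
        entry = match.groupdict()
        entry["actual"] = int(entry["actual"])
        entry["maximum"] = int(entry["maximum"])
        if entry not in found:  # the log prints each message twice
            found.append(entry)
    return found


sample_log = (
    'Cannot store example set: java.sql.SQLException: ORA-12899: value too '
    'large for column "RAPM"."es_643_meta"."attributeName" '
    '(actual: 6, maximum: 5)\n'
    ': java.sql.SQLException: ORA-12899: value too large for column '
    '"RAPM"."es_643_meta"."attributeName" (actual: 6, maximum: 5)\n'
)

overflows = find_overflows(sample_log)
for item in overflows:
    print(item["table"], item["column"], item["actual"], item["maximum"])
```

Collecting the actual and maximum values across several failed uploads makes it easier to compare them with the character and byte lengths of the headers involved.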

  • user194372user194372 Member Posts: 14 Contributor II

     

    Last time I used RapidMiner Server 7.2 on Linux with its operational data in Oracle 12.1.0.2.0, character set UTF8.

     

    I ran the same test with RapidMiner Server 7.2 on Windows 2012 and Oracle 12.1.0, with the database created with character set AL32UTF8.

    It shows the same error.

    This time I captured the TCP traffic to trace the actual data sent from Studio to Server.

    It shows that POST was used and that the Korean letters were encoded in UTF-8.

     

    << HTTP Packet sent from studio to Server >>

    POST /api/rest/resources/hangul1/kortest1 HTTP/1.1
    Accept-Charset: UTF-8
    Content-Type: application/vnd.rapidminer.ioo
    User-Agent: Java/1.8.0_51
    Host: 192.168.80.202:8080
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: keep-alive
    Authorization: Basic YWRtaW46Y2hhbmdlaXQ=
    Content-Length: 569

    *q.................aaabb........polynominal....single_value......................... ........

     

     

    The last part contains the Korean letters encoded in UTF-8, each composed of 3 bytes.

     

    << Binary data Analysis >>

    00000180 75 65 00 00 00 05 00 00 00 00 00 00 00 06 EC 82 ue...... ........
    00000190 AC EA B3 BC 00 00 00 01 00 00 00 09 ED 96 84 EB ........ ........
    000001A0 B2 84 EA B1 B0 00 00 00 02 00 00 00 0C ED 8C 8C ........ ........
    000001B0 EC 9D B8 EC 95 A0 ED 94 8C 00 00 00 03 00 00 00 ........ ........
    000001C0 0F EC 98 A4 EC A7 95 EC 96 B4 EC A7 AC EB BD 95 ........ ........

     

    EC 82 AC (3bytes) is equivalent to Korean letter '사'

    EA B3 BC (3bytes) is equivalent to Korean letter '과'

    ED 96 84 (3bytes) is equivalent to Korean letter '햄'

    EB B2 84 (3bytes) is equivalent to Korean letter '버'
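The byte-to-letter mapping above can be checked by decoding the dump directly. The sketch below reads the first two header entries from the hex dump as a 4-byte big-endian index, a 4-byte big-endian length and then that many UTF-8 bytes. This layout is only an interpretation of the dump, not a documented format of the application/vnd.rapidminer.ioo payload.

```python
import struct

# Bytes copied from offsets 0x186 to 0x1A4 of the dump above.
data = bytes.fromhex(
    "00 00 00 00 00 00 00 06 EC 82 AC EA B3 BC "
    "00 00 00 01 00 00 00 09 ED 96 84 EB B2 84 EA B1 B0"
)


def read_entries(buf):
    """Parse (index, byte_length, text) records under the assumed layout."""
    entries = []
    pos = 0
    while pos < len(buf):
        index, length = struct.unpack_from(">II", buf, pos)
        pos += 8
        text = buf[pos:pos + length].decode("utf-8")
        pos += length
        entries.append((index, length, text))
    return entries


entries = read_entries(data)
for index, length, text in entries:
    # Byte length versus character length of each header.
    print(index, length, len(text), text)
```

Under this reading the first header is 6 bytes but only 2 characters long, and the second is 9 bytes but 3 characters long, so the length sent on the wire is the byte length.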

     

  • kaymankayman Member Posts: 662 Unicorn

    Since you state the problem is in the header (first row) only, could it be related to your database (the one you use for the server) also? Most do not like 'strange characters' for column names, so what if you use more 'friendly' row headers? 

    I had something similar in the past, where my row header was using 'illegal' characters according to my SQL database, but where it was perfectly OK to store them in the actual cells. I had to use a more friendly header and it was solved at that moment.
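One way to try the "friendlier headers" idea is to replace non-ASCII header names with placeholders before storing and keep a mapping for restoring them later. This is only a sketch: the function name `ascii_safe_headers` and the `col_N` naming scheme are made up for illustration, and the sketch does not guard against an existing header that is already called `col_N`.

```python
def ascii_safe_headers(headers):
    """Map each header to itself if ASCII, else to a positional placeholder."""
    mapping = {}
    for position, name in enumerate(headers, start=1):
        if name.isascii():
            mapping[name] = name
        else:
            mapping[name] = f"col_{position}"
    return mapping


# Hypothetical header row mixing ASCII and Korean names.
mapping = ascii_safe_headers(["aaabb", "사과", "햄버거"])
print(mapping)

# Reverse mapping for restoring the original names after retrieval.
restore = {safe: original for original, safe in mapping.items()}
```

In RapidMiner Studio the renaming itself would be done with an operator before Store; the mapping only records which placeholder stands for which original header.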
