"Unable to write binary data into Postgres"

jonmillard Member Posts: 1 Contributor I
edited June 2019 in Help
Hi Folks,

I'm new to RapidMiner, so I hope you'll forgive me if I've missed something fundamental here. I searched for earlier reports of this problem, but nothing relevant came up.

I have written a basic job that:
1.  Reads URLs from a database table in Postgres [Read Database]
2.  Gets the pages into a variable [Get Pages]
3.  Subsets the attribute set [Select Attributes]
4.  Attempts to write the subsetted attributes, including the retrieved page, into another database table in Postgres

The fourth step above returns the following error / stack trace:

++++++++++++++++++++++++++++++++++
Exception: com.rapidminer.operator.UserError
Message: Database error occurred: ERROR: invalid byte sequence for encoding "UTF8": 0x00
Stack trace:

  com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:115)
  com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:66)
  com.rapidminer.operator.io.AbstractWriter.doWork(AbstractWriter.java:69)
  com.rapidminer.operator.Operator.execute(Operator.java:833)
  com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
  com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
  com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
  com.rapidminer.operator.Operator.execute(Operator.java:833)
  com.rapidminer.Process.run(Process.java:925)
  com.rapidminer.Process.run(Process.java:848)
  com.rapidminer.Process.run(Process.java:807)
  com.rapidminer.Process.run(Process.java:802)
  com.rapidminer.Process.run(Process.java:792)
  com.rapidminer.gui.ProcessThread.run(ProcessThread.java:63)

Cause
Exception: org.postgresql.util.PSQLException
Message: ERROR: invalid byte sequence for encoding "UTF8": 0x00
Stack trace:

  org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103)
  org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
  org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
  org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:512)
  org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
  org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
  com.rapidminer.tools.jdbc.DatabaseHandler.applyBatchInsertIntoTable(DatabaseHandler.java:692)
  com.rapidminer.tools.jdbc.DatabaseHandler.createTable(DatabaseHandler.java:591)
  com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:107)
  com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:66)
  com.rapidminer.operator.io.AbstractWriter.doWork(AbstractWriter.java:69)
  com.rapidminer.operator.Operator.execute(Operator.java:833)
  com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
  com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
  com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
  com.rapidminer.operator.Operator.execute(Operator.java:833)
  com.rapidminer.Process.run(Process.java:925)
  com.rapidminer.Process.run(Process.java:848)
  com.rapidminer.Process.run(Process.java:807)
  com.rapidminer.Process.run(Process.java:802)
  com.rapidminer.Process.run(Process.java:792)
  com.rapidminer.gui.ProcessThread.run(ProcessThread.java:63)
++++++++++++++++++++++++++++++++++

I do understand why this is occurring: the destination column in Postgres is of type 'text', whereas at least one of the 'pages' is actually binary (in this case an XLS file). When Postgres encounters a null byte (0x00) in the input, it rejects the value and the JDBC driver throws an exception. I did try setting the destination field to 'bytea', but then either RapidMiner or the JDBC driver (I can't recall which, sorry) complains that the datatype is not compatible and throws an exception as well. I would prefer to store the downloaded content exactly as obtained (as a record) for later use, but I cannot find an operator that will cast a 'text' variable to a byte array or something similar.
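To make the constraint concrete, here is a minimal Java sketch of two possible workarounds: dropping the 0x00 characters (lossy) or Base64-encoding the raw bytes so they fit in a 'text' column (lossless). The class and method names (`BinarySafeText`, `containsNul`, `stripNul`, `toBase64`, `fromBase64`) are hypothetical and are not part of RapidMiner or the Postgres JDBC driver. The Base64 route also assumes the original bytes are still available; if the page has already been decoded into a string, the binary content may not be fully recoverable.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not a RapidMiner or JDBC API.
final class BinarySafeText {
    private BinarySafeText() {}

    // True if the string contains U+0000, which Postgres rejects in 'text' values.
    static boolean containsNul(String s) {
        return s.indexOf('\u0000') >= 0;
    }

    // Lossy option: remove every U+0000 character so the insert succeeds.
    static String stripNul(String s) {
        return s.replace("\u0000", "");
    }

    // Lossless option: encode raw bytes as Base64 text, which never contains U+0000.
    static String toBase64(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Reverse of toBase64: recover the original bytes when reading the column back.
    static byte[] fromBase64(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Made-up sample bytes standing in for a binary download; note the 0x00.
        byte[] page = {(byte) 0xD0, (byte) 0xCF, 0x11, 0x00, 0x41};

        String encoded = toBase64(page);
        System.out.println("safe for a text column: " + !containsNul(encoded));
        System.out.println("round trip intact: " + Arrays.equals(page, fromBase64(encoded)));

        // Decoding the same bytes as text keeps the NUL, which is what triggers the error.
        String decoded = new String(page, StandardCharsets.ISO_8859_1);
        System.out.println("decoded text contains NUL: " + containsNul(decoded));
    }
}
```

Base64 increases the stored size by roughly a third, and anything that reads the column later has to decode it first. Stripping the null characters is simpler, but the stored value is then no longer the content exactly as obtained.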

Any thoughts on what might be going wrong here or what I ought to try?

Thanks in advance,

Jon