RapidMiner

Parsing JSON with an illegal unquoted character

SOLVED
Regular Contributor

Hi,

I'm using the JSON To Data operator to parse JSON returned by a web service call. Some of the returned data contains single quotes ("We've noticed..."), which I believe is causing this error:

de.rapidanalytics.ejb.service.ServiceDataSourceException Error executing process /myserver/surveys/retrieve_surveys for service retrieve_surveys: The input JSON document is malformed: 'Illegal unquoted character ((CTRL-CHAR, code 10)): has to be escaped using backslash to be included in string value at [Source: {"result":

I can't change the data coming from the web service. Is there a way to fix it after I retrieve it but before it is passed to the parser? I'm retrieving it with an Execute Process operator that runs cURL, followed by a Read Document operator.

Thank you,

Rachel

4 REPLIES
Regular Contributor

Re: Parsing JSON with an illegal unquoted character

I'm fine with just removing the single quotes for now, if that makes the solution easier.

Community Manager

Re: Parsing JSON with an illegal unquoted character

Is toggling on "skip invalid documents" not an option for this particular process?

Regards,
T-Bone
Twitter: @neuralmarket
Regular Contributor

Re: Parsing JSON with an illegal unquoted character

I still want the contents of the document; I just need to fix it first.

Elite

Re: Parsing JSON with an illegal unquoted character

What if you add a Replace Tokens operator in front of JSON To Data and replace every unescaped single quote with an escaped one?

Something like this:

<?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="text:replace_tokens" compatibility="7.3.000" expanded="true" height="68" name="Replace Tokens" width="90" x="45" y="34">
        <list key="replace_dictionary">
          <parameter key="([^\\])'" value="$1\\'"/>
        </list>
      </operator>
      <operator activated="true" class="text:json_to_data" compatibility="7.3.000" expanded="true" height="82" name="JSON To Data" width="90" x="179" y="34"/>
      <connect from_port="input 1" to_op="Replace Tokens" to_port="document"/>
      <connect from_op="Replace Tokens" from_port="document" to_op="JSON To Data" to_port="documents 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>

 

This replaces "We've noticed..." with "We\'ve noticed..." before the document reaches JSON To Data.
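
One thing that may be worth checking: the error text names "CTRL-CHAR, code 10", and code 10 is the ASCII line feed. That suggests the parser could be objecting to a raw newline inside a string value rather than to the apostrophe (a single quote is legal as-is inside a double-quoted JSON string). If that is the case here, the newline is what needs escaping. Below is a minimal Python sketch of that idea, assuming the cURL output can be passed through a small script before Read Document. The function name `escape_control_chars` and the sample input are made up for illustration and are not part of RapidMiner.

```python
import json


def escape_control_chars(raw: str) -> str:
    """Escape raw control characters that occur inside JSON string values.

    Whitespace between JSON tokens is left untouched, so pretty-printed
    documents keep their layout.
    """
    out = []
    in_string = False  # True while between an opening and a closing double quote
    escaped = False    # True when the previous character was a backslash
    for ch in raw:
        if in_string:
            if escaped:
                # Character following a backslash: copy it unchanged.
                out.append(ch)
                escaped = False
            elif ch == "\\":
                out.append(ch)
                escaped = True
            elif ch == '"':
                out.append(ch)
                in_string = False
            elif ord(ch) < 0x20:
                # Raw control character, e.g. code 10 (line feed):
                # json.dumps yields its escaped form; strip the quotes.
                out.append(json.dumps(ch)[1:-1])
            else:
                out.append(ch)
        else:
            if ch == '"':
                in_string = True
            out.append(ch)
    return "".join(out)


# Hypothetical sample: a raw line feed inside the string value.
broken = '{"result": "We\'ve noticed...\nsecond line"}'
fixed = escape_control_chars(broken)
print(json.loads(fixed)["result"])
```

The sketch tracks whether it is inside a string so that only newlines within values are escaped. A blanket regex replacement of every newline would also rewrite the line breaks between JSON tokens, which would make the document invalid in a different way.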