RapidMiner

Parsing JSON with an illegal unquoted character

SOLVED
Regular Contributor

Hi,

 

I'm using the JSON to Data operator to parse JSON from a web service call.  Some of the data returned has single quotes in it ("We've noticed...").  This is causing the error:

de.rapidanalytics.ejb.service.ServiceDataSourceException Error executing process /myserver/surveys/retrieve_surveys for service retrieve_surveys: The input JSON document is malformed: 'Illegal unquoted character ((CTRL-CHAR, code 10)): has to be escaped using backslash to be included in string value at [Source: {"result":

I'm not able to change the data from the web service.  Is there a way I can fix it after I retrieve it but before I pass it to be parsed?  I'm retrieving it with an Execute Process operator with cURL and then a Read Document.  

 

Thank you,

Rachel
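
A note on the error message above: it reports `CTRL-CHAR, code 10`, and character code 10 is the ASCII line feed, so the parser appears to be rejecting a raw newline inside a string value rather than the apostrophe. The sketch below is a minimal illustration in Python, not a RapidMiner process. The helper name `escape_control_chars_in_strings` and the sample payload are hypothetical. It escapes control characters only where they occur inside double-quoted strings, so line breaks between JSON tokens are left alone.

```python
import json


def escape_control_chars_in_strings(raw: str) -> str:
    """Escape raw control characters (code < 32) found inside JSON strings.

    Hypothetical helper for illustration; characters outside string
    values are copied through unchanged.
    """
    out = []
    in_string = False
    escaped = False  # previous character was a backslash inside a string
    for ch in raw:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif escaped:
            # Character following a backslash: copy it as-is.
            out.append(ch)
            escaped = False
        elif ch == "\\":
            out.append(ch)
            escaped = True
        elif ch == '"':
            out.append(ch)
            in_string = False
        elif ord(ch) < 0x20:
            # json.dumps returns a quoted JSON string; drop the quotes
            # to get just the escape sequence (e.g. backslash + n).
            out.append(json.dumps(ch)[1:-1])
        else:
            out.append(ch)
    return "".join(out)


# Made-up payload: an apostrophe plus a raw line feed inside the value.
broken = '{"result": "We\'ve noticed...\nsecond line"}'
fixed = escape_control_chars_in_strings(broken)
print(json.loads(fixed)["result"])
```

With Python's `json` module, `json.loads(broken)` raises `json.JSONDecodeError`, while `json.loads(fixed)` succeeds with the apostrophe untouched. Whether the same holds for the JSON To Data operator is an assumption that would need checking against the real response.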

 

 

4 REPLIES
Regular Contributor

Re: Parsing JSON with an illegal unquoted character

I'm fine with just removing the single quotes for now, if that makes the solution easier.

Community Manager

Re: Parsing JSON with an illegal unquoted character

Is toggling on "skip invalid documents" not possible for this particular process?

Regards,
T-Bone
Twitter: @neuralmarket
Regular Contributor

Re: Parsing JSON with an illegal unquoted character

I still want the contents of the document, I just need to fix it first.

Elite

Re: Parsing JSON with an illegal unquoted character

What if you add a Replace Tokens operator in front of JSON To Data and replace every unescaped single quote with an escaped one?

Something like this?

<?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="text:replace_tokens" compatibility="7.3.000" expanded="true" height="68" name="Replace Tokens" width="90" x="45" y="34">
        <list key="replace_dictionary">
          <parameter key="([^\\])'" value="$1\\'"/>
        </list>
      </operator>
      <operator activated="true" class="text:json_to_data" compatibility="7.3.000" expanded="true" height="82" name="JSON To Data" width="90" x="179" y="34"/>
      <connect from_port="input 1" to_op="Replace Tokens" to_port="document"/>
      <connect from_op="Replace Tokens" from_port="document" to_op="JSON To Data" to_port="documents 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>

 

This replaces "We've noticed..." with "We\'ve noticed...".
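
To try the substitution outside RapidMiner, here is a minimal Python sketch of the same regular expression. The function name `escape_single_quotes` is hypothetical. Python writes the backreference as `\1` where the Replace Tokens entry uses `$1`.

```python
import re

# Same pattern as the replace_dictionary entry: one character that is not
# a backslash (captured as group 1), followed by a single quote.
UNESCAPED_QUOTE = re.compile(r"([^\\])'")


def escape_single_quotes(text: str) -> str:
    """Put a backslash before each single quote not already preceded by one."""
    # \1 re-emits the captured character; \\ is a literal backslash.
    return UNESCAPED_QUOTE.sub(r"\1\\'", text)


print(escape_single_quotes("We've noticed..."))  # prints We\'ve noticed...
```

Two caveats on this pattern: because it needs a preceding character, a quote at the very start of the text or directly after another quote is not matched. Also, `\'` is not one of the escape sequences defined by the JSON specification, so a strict parser may reject the result. Whether JSON To Data accepts it is not stated here.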