Cannot retrieve data with "Enrich Data by Webservice"

rachel_lomasky Member Posts: 52 Guru
edited November 2018 in Help

Hi,

 

I've downloaded the Web Mining extension and would like to use it to connect to a Google-provided web service. I've constructed a GET URL, and it works fine when I paste it into a browser (a bunch of JSON is returned). However, when I run it with "Enrich Data by Webservice", I get:

Dec 3, 2016 10:31:57 AM SEVERE: Process failed: Cannot retrieve data from the specified URL 'https://www.googleapis.com/analytics/v3/data/ga'.
Dec 3, 2016 10:31:57 AM SEVERE: Here:
Dec 3, 2016 10:31:57 AM SEVERE: Process[1] (Process)
Dec 3, 2016 10:31:57 AM SEVERE: subprocess 'Main Process'
Dec 3, 2016 10:31:57 AM SEVERE: +- Retrieve questions[1] (Retrieve)
Dec 3, 2016 10:31:57 AM SEVERE: ==> +- Enrich Data by Webservice[1] (Enrich Data by Webservice)

Two questions:

1. Why doesn't it work?

2. Is there a way that I can see the query string to do debugging?

 

Thank you,

Rachel

Best Answer

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Solution Accepted

    Here's a sample process (built with RM 7.3):

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_data_user_specification" compatibility="7.3.000" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="45" y="34">
    <list key="attribute_values">
    <parameter key="foo" value="0"/>
    </list>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.3.000" expanded="true" height="68" name="Enrich Data by Webservice" width="90" x="179" y="34">
    <parameter key="query_type" value="Regular Expression"/>
    <list key="string_machting_queries"/>
    <list key="regular_expression_queries">
    <parameter key="foo2" value=".*"/>
    </list>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries"/>
    <parameter key="url" value="https://www.googleapis.com/analytics/v3/data/ga?ids=ga:XXXXX&amp;start-date=30daysAgo&amp;end-date=yesterday&amp;metrics=ga:sessions&amp;access_token=XXXXXX"/>
    <list key="request_properties"/>
    </operator>
    <connect from_op="Generate Data by User Specification" from_port="output" to_op="Enrich Data by Webservice" to_port="Example Set"/>
    <connect from_op="Enrich Data by Webservice" from_port="ExampleSet" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process> 

    I just tested this with my own Google API account and it works.

     

    Scott 

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi. I use the Google API with this operator all the time, and it is quite tricky to get all the settings right. First guess: did you URL-encode your URL? Can you share your parameter settings (without your key, of course)?

    The answer to your second question is no: RM does not give you the same verbose output you would get in a terminal. When I can't get a request right, I get it working with cURL at the command line first and then go back to RM.

    Scott

  • rachel_lomasky Member Posts: 52 Guru

    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
    <operator activated="true" class="retrieve" compatibility="7.2.003" expanded="true" height="68" name="Retrieve questions" width="90" x="45" y="85">
    <parameter key="repository_entry" value="../../data/import/questions"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.2.001" expanded="true" height="68" name="Enrich Data by Webservice" width="90" x="246" y="85">
    <parameter key="query_type" value="Regular Expression"/>
    <list key="string_machting_queries"/>
    <parameter key="attribute_type" value="Nominal"/>
    <list key="regular_expression_queries"/>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <parameter key="ignore_CDATA" value="true"/>
    <parameter key="assume_html" value="true"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries"/>
    <parameter key="request_method" value="GET"/>
    <parameter key="service_method" value="reportRequests"/>
    <parameter key="url" value="https://www.googleapis.com/analytics/v3/data/ga"/>
    <parameter key="delay" value="0"/>
    <list key="request_properties">
    <parameter key="ids" value="ga:myids"/>
    <parameter key="start-date" value="30daysAgo"/>
    <parameter key="end-date" value="yesterday"/>
    <parameter key="metrics" value="ga:sessions"/>
    <parameter key="access_token" value="my access token"/>
    </list>
    <parameter key="encoding" value="SYSTEM"/>
    </operator>
    </process>

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi, OK, thanks. That XML was hard to follow (it's from version 7.2 and there's some strange cut and paste in there), but I think I know what you're doing. I have not used the Google Analytics API before, but for a GET request I would first try putting all the parameters in the URL rather than in "request properties". Don't ask me why this makes a difference, but in my experience it does. Try something like this in the URL:

     

    https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3A<your number here>&start-date=30daysAgo&end-date=yesterday&metrics=ga%3Asessions&access_token=<your access token>

     

    I also don't see anything in your String Matching query (spelled "Machting" in the XML!), so you'll need to tell RapidMiner what you want to do with the response. I would recommend just using Regular Expression with .* for now, just to ensure you're getting a response.
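
For anyone who wants to check the query string outside RapidMiner, here is a minimal Python sketch of how the URL above could be assembled and percent-encoded. The function name `build_ga_url` and the placeholder values are hypothetical and only for illustration; the endpoint and parameter names are the ones used in this thread. Printing the result gives a string that can be compared with the URL that works in the browser before pasting it into the operator's "url" parameter.

```python
from urllib.parse import urlencode

# Endpoint taken from the thread; everything else here is illustrative.
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_ga_url(ids, access_token, start_date="30daysAgo",
                 end_date="yesterday", metrics="ga:sessions"):
    """Return the full GET URL with all parameters percent-encoded."""
    params = {
        "ids": ids,
        "start-date": start_date,
        "end-date": end_date,
        "metrics": metrics,
        "access_token": access_token,
    }
    # urlencode percent-encodes reserved characters, so ":" becomes "%3A".
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder values; substitute your own view ID and access token.
    print(build_ga_url("ga:XXXXX", "XXXXXX"))
    # prints https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3AXXXXX&start-date=30daysAgo&end-date=yesterday&metrics=ga%3Asessions&access_token=XXXXXX
```

This only builds the string and does not send a request, so it is safe to run without credentials.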

     

    Scott

     

  • rachel_lomasky Member Posts: 52 Guru

    Thank you, this works.  Now to figure out how to parse the response...

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    <grin> It should not be too bad. There are a variety of tools you can use. Post again if you need more help.

     

    Scott


  • rachel_lomasky Member Posts: 52 Guru

    It ain't pretty, but I got it working :).

  • khairulnizam Member Posts: 1 Contributor I

    Hi, I have the same problem with "Enrich Data by Webservice". I already tried the same parameters using curl, and it works. Here is my process:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="text:create_document" compatibility="7.4.001" expanded="true" height="68" name="Create Document" width="90" x="45" y="136">
    <parameter key="text" value="I love hotdogs. Hotdogs are the greatest. They are hot and delicious."/>
    <parameter key="add label" value="false"/>
    <parameter key="label_type" value="nominal"/>
    </operator>
    <operator activated="true" class="text:documents_to_data" compatibility="7.4.001" expanded="true" height="82" name="Documents to Data" width="90" x="179" y="136">
    <parameter key="text_attribute" value="text"/>
    <parameter key="add_meta_information" value="true"/>
    <parameter key="datamanagement" value="double_sparse_array"/>
    </operator>
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.3.000" expanded="true" height="68" name="Enrich Data by Webservice" width="90" x="313" y="136">
    <parameter key="query_type" value="Regular Expression"/>
    <list key="string_machting_queries"/>
    <parameter key="attribute_type" value="Nominal"/>
    <list key="regular_expression_queries">
    <parameter key="all" value=".*"/>
    </list>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <parameter key="ignore_CDATA" value="true"/>
    <parameter key="assume_html" value="true"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries"/>
    <parameter key="request_method" value="POST"/>
    <parameter key="body" value="text=&lt;%text%&gt;"/>
    <parameter key="url" value="https://twinword-sentiment-analysis.p.mashape.com/analyze/"/>
    <parameter key="delay" value="0"/>
    <list key="request_properties">
    <parameter key="X-Mashape-Key" value="QhBpo6d9YgmsherFsSBVfycN0czjp1rf0HIjsnooes2EdNYmao"/>
    <parameter key="Content-Type" value="application/x-www-form-urlencoded"/>
    <parameter key="Accept" value="application/json"/>
    </list>
    <parameter key="encoding" value="SYSTEM"/>
    </operator>
    <connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
    <connect from_op="Documents to Data" from_port="example set" to_op="Enrich Data by Webservice" to_port="Example Set"/>
    <connect from_op="Enrich Data by Webservice" from_port="ExampleSet" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I think there's a problem with your API key. I tried your XML code and got a JSON response that says:

    {"message":"Missing Mashape application key. Go to http:\/\/docs.mashape.com\/api-keys to learn how to get your API application key."}
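
To check what a request like the one in the process above would carry, a small Python sketch along these lines may help. It builds the POST request without sending it, so the headers and the form-encoded body can be inspected. The function name `build_sentiment_request` and the key `YOUR_MASHAPE_KEY` are placeholders, not part of any API; the URL and header names are copied from the process XML in this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL copied from the process XML above.
URL = "https://twinword-sentiment-analysis.p.mashape.com/analyze/"


def build_sentiment_request(text, api_key):
    """Build (but do not send) the POST request for inspection."""
    # Form-encode the body to match the declared Content-Type.
    body = urlencode({"text": text}).encode("utf-8")
    headers = {
        "X-Mashape-Key": api_key,
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    return Request(URL, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder key; substitute your own.
    req = build_sentiment_request("I love hotdogs.", "YOUR_MASHAPE_KEY")
    print(req.get_method(), req.full_url)
    # urllib normalizes header-name capitalization (for example
    # "X-mashape-key"); HTTP header names are case-insensitive.
    for name, value in req.header_items():
        print(name, "=", value)
    print(req.data.decode("utf-8"))
```

If the key header is missing from the printed list, or the key itself is no longer valid, a "Missing Mashape application key" style response would be the expected outcome.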

      

  • rachel_lomasky Member Posts: 52 Guru

    My problem was that I was quoting parameters. Everything should be non-quoted.
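
One possible reading of the quoting problem, sketched in Python: if a parameter value is wrapped in literal quotes, the quotes are percent-encoded and sent as part of the value. The helper names `strip_wrapping_quotes` and `encode_param` are hypothetical, and this is an assumption about what "quoting" meant here, not a description of what RapidMiner does internally.

```python
from urllib.parse import urlencode


def strip_wrapping_quotes(value):
    """Remove one pair of matching quotes around a value, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


def encode_param(name, value):
    """Percent-encode a single name=value pair as it would appear in a URL."""
    return urlencode({name: value})


if __name__ == "__main__":
    quoted_value = '"ga:sessions"'
    # The literal quotes travel with the value as %22.
    print(encode_param("metrics", quoted_value))
    # prints metrics=%22ga%3Asessions%22
    print(encode_param("metrics", strip_wrapping_quotes(quoted_value)))
    # prints metrics=ga%3Asessions
```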
