Google Cloud Speech API

SHSguy Member Posts: 24 Contributor I
edited December 2018 in Help

Hi,

I was wondering whether anyone knows the process required to get an output from the Google Cloud Speech API.

Yes, before anyone asks, I did get the required Google API key, which has been omitted for obvious reasons :)

The tutorial indicates that you need the Google Speech operator, which should then provide an example output (Data Table).

The process runs and the operator receives a green check mark, but it produces no output.

 

Tried:

Various audio input formats (mp3,mp4,flac)

Various audio output formats (mp3,mp4,flac)

 

Appreciate any assistance on the matter. 

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="google_cloud_platform:cloud_speech" compatibility="1.1.000" expanded="true" height="68" name="Google Cloud Speech API" width="90" x="313" y="238">
<parameter key="API key" value="OMITTED"/>
<parameter key="audio file" value="/Users/Nick/Desktop/Brad_interview_20181605.flac"/>
<parameter key="sample rate (Hz)" value="18000"/>
<parameter key="language code" value="en-US"/>
</operator>
<connect from_op="Google Cloud Speech API" from_port="example set" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

 

Screen Shot 2018-05-17 at 1.43.43 pm.png

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,967  Community Manager

    Hi @SHSguy, the Cloud Speech API operator should return an output if you have a valid audio file and valid parameters. Try this with your API key:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="api_toolbox:cloud_speech" compatibility="1.1.000" expanded="true" height="68" name="Google Cloud Speech API" width="90" x="45" y="34">
    <parameter key="API key" value=""/>
    <parameter key="request type" value="remote"/>
    <parameter key="Cloud Storage Bucket URI" value="gs://speech-demo/shwazil_hoful.flac"/>
    <parameter key="sample rate (Hz)" value="16000"/>
    <parameter key="language code" value="en-US"/>
    </operator>
    <connect from_op="Google Cloud Speech API" from_port="example set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

    You should get this:

     

    Screen Shot 2018-05-21 at 11.40.25 AM.png

     

    Scott

  • SHSguy Member Posts: 24 Contributor I

    Hi Scott, 

    Thank you for the reply. 

    I have tried that process, replacing only the Google Speech API key, and it seems to do something. However, instead of producing a result, it changes the XML code you provided to:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="dummy" compatibility="8.2.000" expanded="true" height="68" name="Google Cloud Speech API" width="90" x="45" y="34"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    </process>
    </operator>
    </process>

    Meanwhile, the Process panel shows a red operator box titled Google Cloud Speech (please see below). It then indicates that I should download an extension from the Marketplace. It opens the Marketplace and then returns an error stating that the extension is not available.

     

    Cheers,

    Nicolas 

     


    Capture.JPG

  • mschmitz Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 1,872  RM Data Scientist

    Hi @SHSguy,

    where did you get the operator from in the first place?

     

    We are currently adding capabilities to an extension for connecting to cloud services. Scott used an unreleased beta of it, which offers the option to connect to Google Speech. Since you do not have that beta, you get the red operators.
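For anyone comparing the two processes in this thread: the part of an operator's `class` attribute before the colon (`google_cloud_platform` in the original post, `api_toolbox` in Scott's) appears to name the extension the operator comes from, and an operator that could not be resolved shows up with class `dummy`. Below is a rough Python sketch for listing both from a saved process XML. The function names are made up for illustration and are not part of RapidMiner, and the reading of `dummy` as a placeholder is inferred only from the XML quoted above.

```python
import xml.etree.ElementTree as ET


def _operators(process_xml):
    # Encode to bytes so the parser accepts an XML declaration if one is present.
    root = ET.fromstring(process_xml.encode("utf-8"))
    return list(root.iter("operator"))


def extension_prefixes(process_xml):
    """Sorted list of the prefixes before ':' in operator class names."""
    prefixes = set()
    for op in _operators(process_xml):
        cls = op.get("class", "")
        if ":" in cls:
            prefixes.add(cls.split(":", 1)[0])
    return sorted(prefixes)


def placeholder_operators(process_xml):
    """Names of operators whose class is 'dummy' (assumed to mean 'unresolved')."""
    return [op.get("name") for op in _operators(process_xml)
            if op.get("class") == "dummy"]
```

If `extension_prefixes` reports a prefix for which no extension is installed, the process would be expected to open with red operators, as described above.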

     

    Best,

    Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • SHSguy Member Posts: 24 Contributor I

    Hi Martin, 

    It was in the Marketplace. Unfortunately, my incredibly limited coding expertise (hacking) did not lead me to find it on the deep servers of RapidMiner :)

     

    Cheers,

     

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,967  Community Manager

    Yes, that extension is indeed on the Marketplace and has been available since last fall (see https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_google_cloud_platform).

     

    Scott

     

  • bking Member Posts: 4 Contributor I

    Hi Scott...

    I am getting a similar non-result: the Google Cloud Speech API operator appears to run (no complaints) but does not return an example set. I confirmed that the .wav file contains audible speech.

     

    XML Below:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.002">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="google_cloud_platform:cloud_speech" compatibility="1.1.000" expanded="true" height="68" name="Google Cloud Speech API" width="90" x="112" y="34">
    <parameter key="API key" value=""/>
    <parameter key="audio file" value="C:\Users\PR4756082944-20180924-5044029957.wav"/>
    <parameter key="sample rate (Hz)" value="16000"/>
    <parameter key="language code" value="en-US"/>
    </operator>
    <connect from_op="Google Cloud Speech API" from_port="example set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
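One thing that might be worth ruling out when the operator finishes without a result is a mismatch between the `sample rate (Hz)` parameter and the audio file itself. This is only a guess at a possible cause, not a confirmed diagnosis for this thread. A minimal sketch using Python's standard `wave` module to read what the file header says is shown below. The helper names are hypothetical, and it only handles uncompressed PCM WAV files.

```python
import wave


def describe_wav(path):
    """Read basic header fields from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }


def matches_declared_rate(path, declared_hz):
    """True if the file's own sample rate equals the operator parameter."""
    return describe_wav(path)["sample_rate_hz"] == declared_hz
```

Calling `describe_wav` on the audio file and comparing `sample_rate_hz` with the value entered in the operator would show whether the two agree. The channel count and duration are included because speech services often have limits on both, though whether that matters for this operator is an assumption.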

     

    Thoughts?...

  • mschmitz Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 1,872  RM Data Scientist

    Hi @bking,

    Can you check your rapidminer-studio.log? Sometimes there are errors that are visible only in this file and are not propagated to the "speech bubbles" in the UI.
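A small sketch of how the log could be scanned for anything above INFO, in case it is long. It assumes the two-line layout visible in the log excerpt posted in this thread (a timestamp/class line followed by a `LEVEL: message` line) and assumes the file lives in the `.RapidMiner` folder under the user's home directory; adjust the path if your installation differs. The function name is made up for illustration.

```python
import re
from pathlib import Path

# Levels that usually indicate a problem in java.util.logging style output.
PROBLEM_LEVELS = ("SEVERE", "WARNING")
LEVEL_LINE = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST): (.*)$")


def find_problems(log_text):
    """Return (header_line, level, message) for each WARNING or SEVERE record."""
    problems = []
    previous = ""
    for raw in log_text.splitlines():
        line = raw.strip()
        match = LEVEL_LINE.match(line)
        if match and match.group(1) in PROBLEM_LEVELS:
            # The line before a level line carries the timestamp and source.
            problems.append((previous, match.group(1), match.group(2)))
        if line:
            previous = line
    return problems


if __name__ == "__main__":
    # Assumed default location; not verified for every installation.
    log_path = Path.home() / ".RapidMiner" / "rapidminer-studio.log"
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        for header, level, message in find_problems(text):
            print(header, "|", level, "|", message)
    else:
        print("Log not found at", log_path)
```

An empty result only means no WARNING or SEVERE lines were written; it does not prove the request to the speech service succeeded.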

     

    BR,

    Martin

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,967  Community Manager

    Hi @bking, I'm out of the office most of this week. Please ping me again at the end of the week if you're still having trouble.

     

    thx @mschmitz for chiming in.


    Scott

     

  • bking Member Posts: 4 Contributor I

    Hi Martin...

     

    Thank you for replying. Below is the output from the log file.

    -------------

    Sep 25, 2018 4:45:41 PM com.rapidminer.Process execute
    INFO: Process //Voice_Analysis_from_TCN_Campaign starts
    Sep 25, 2018 4:45:59 PM com.rapidminer.Process saveResults
    INFO: Saving results.
    Sep 25, 2018 4:45:59 PM com.rapidminer.Process execute
    INFO: Process //Voice_Analysis_from_TCN_Campaign finished successfully after 18 s

    -----------

    Thoughts?

     

    BK
