Importing Errors / Socrata

emiliocg Member Posts: 5 Contributor I
edited November 2018 in Help

Hello,

I don't have much expertise on this topic, but I've used some data prep tools before.

An agency in Texas publishes an open data portal that runs on a platform called Socrata.

But I'm getting errors when importing the data into RapidMiner.

I already tried exporting as CSV, CSV for Excel, TSV... In all cases there are many import errors...

I remember a conversation with a peer who worked on the same file months ago. He told me he found some "," characters (not sure if extra and/or missing) in the data, which created discrepancies and column errors when importing.

Any suggestions on how to solve that?

 

Thanks


 
Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Are you trying to export data from the Socrata system to read into RapidMiner, or do you want to export data from RapidMiner to read into Socrata?

  • emiliocg Member Posts: 5 Contributor I

    From Socrata to RapidMiner

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    So I'm not familiar with Socrata and AFAIK, there is no direct connector from RM to Socrata.

     

    If you can export the data as a CSV file, then you could use the Read CSV operator to import the data into RapidMiner.  I would use the Import Wizard on the Read CSV operator to drill down and configure the import. Sometimes CSV files contain extra information or notes that can cause the initial import to fail.

     

    That said, Read CSV is like a Swiss Army knife for loading flat files. It can do some really cool imports, but you've got to take the time to tune it.
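
Before tuning the import, it can help to find out which rows are actually malformed. The following is a minimal sketch in Python (outside RapidMiner), not something from this thread: it counts the fields in every row and reports the rows whose width differs from the header, which is what stray or missing "," characters would typically cause. The function name `field_count_report` and the default delimiter, quote character, and encoding are assumptions; adjust them to match the export.

```python
import csv
from collections import Counter


def field_count_report(path, delimiter=",", quotechar='"', encoding="utf-8"):
    """Compare every row's field count against the header row.

    Returns (expected_width, widths, bad_lines) where widths maps
    field count -> number of data rows and bad_lines holds the file
    line numbers of the first few rows that do not match the header.
    """
    widths = Counter()
    bad_lines = []
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter=delimiter, quotechar=quotechar)
        header = next(reader, None)
        if header is None:
            # Empty file: nothing to compare against.
            return 0, widths, bad_lines
        expected = len(header)
        for row in reader:
            widths[len(row)] += 1
            # Keep only the first 20 offenders so a 2 GB file stays manageable.
            if len(row) != expected and len(bad_lines) < 20:
                bad_lines.append(reader.line_num)
    return expected, widths, bad_lines


if __name__ == "__main__":
    # "export.csv" is a placeholder name for the file exported from the portal.
    expected, widths, bad_lines = field_count_report("export.csv")
    print("columns in header:", expected)
    print("rows per field count:", dict(widths))
    print("first mismatching lines:", bad_lines)
```

If every row has the same width here but the import still fails, the problem is more likely in the quote, escape, or encoding settings of the import than in the data itself.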

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Just a quick addition: if you can share some of the CSV data you have exported, we can have a look at it and see what the problems are (and hopefully help you find the correct settings).

     

    Thanks,

    Ingo

  • emiliocg Member Posts: 5 Contributor I

    Sure, Ingo. How can I share it with you? Here? (It's ~2 GB)

  • emiliocg Member Posts: 5 Contributor I

    Hi, T-Bone. Is there any particular tool that you recommend for Read CSV? (And sorry if the question is too basic, this is my first time facing this problem.)

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Probably the first 100 lines or so are sufficient.  You can use a text editor to cut those out. The resulting file should be small enough to attach here, or to send to one of us as a private message if the data is confidential.

     

    Cheers,

    Ingo
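
Many text editors struggle to open a ~2 GB file, so here is a minimal sketch in Python for cutting out the first 100 lines without loading the whole file. It is only an illustration under assumptions: the function name `write_head`, the file names, and the UTF-8 encoding are hypothetical and not from this thread. It reads line by line, so a quoted field that contains a line break could be cut in the middle of a record.

```python
from itertools import islice


def write_head(source_path, sample_path, n_lines=100, encoding="utf-8"):
    """Copy the first n_lines of source_path to sample_path.

    Returns the number of lines written, which is smaller than
    n_lines when the source file is shorter.
    """
    written = 0
    # newline="" keeps the original line endings untouched.
    with open(source_path, "r", encoding=encoding, newline="") as src, \
            open(sample_path, "w", encoding=encoding, newline="") as dst:
        # islice stops after n_lines, so the rest of the file is never read.
        for line in islice(src, n_lines):
            dst.write(line)
            written += 1
    return written


if __name__ == "__main__":
    # Both file names are placeholders.
    count = write_head("export.csv", "export_sample.csv", n_lines=100)
    print("lines written:", count)
```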

  • emiliocg Member Posts: 5 Contributor I

    Thanks, I just sent a PM.

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Yes, RapidMiner has an operator called Read CSV. It's located under the Data Access > Files > Read folder of the Operator tab (typically lower left in RapidMiner Studio).
