I am trying to use RapidMiner (7.2.003) to automate some of my more advanced reporting that requires advanced SOQL across multiple objects for accuracy. Specifically, I need to understand the stage of the opportunity as well as the status of the quote, finally aggregating the quote line data to summarize sales performance from quote to invoice conversion.
I can successfully traverse an sObject relationship to generate a table, but if I attempt to do anything (select attributes, set roles, etc.) with fields accessed via an sObject relationship (fields which are not native to the explicitly declared "FROM" object), RapidMiner throws an "Attribute not found" error. Originally, I thought this could have something to do with the use of dot-notation, so I tried aliasing the referenced fields, which did not work. I also tried to fix the data using Rename and Rename by Replacing; neither worked.
As it stands, I have to use separate Read Salesforce operators to pull individual field sets from the individual objects and then join them, which is not something I'd like to do: with N Read Salesforce operators, I would need N-1 Join operators and at least N data prep operators. Does anyone have a workaround or access to guide materials for using Salesforce SOQL inside of RapidMiner?
Ah yes, SOQL is a bit finicky at times.
With respect to your problem, does RapidMiner just complain, or does the process crash when you run it? I've only ever declared the FROM statement through the Read Salesforce operator, and it properly propagates the metadata through the process. Not sure what is going on here. Maybe @Marco_Boeck might be able to shed some light on this.
If I attempt to use a field reference that is accessed via relationship it will give me a red pop-up with an x in the background that reads:
Attribute not found
The attribute n_Quote__r.n_Opp__r.Id specified in the attribute_name parameter does not exist in the input data
This is from a "Set Role" operator - although the operator here doesn't matter, the result is basically the same regardless of the operator. Any field accessed via a relationship throws this "Attribute not found" error, sadly. I can use "Select Attributes" set to "All", but once I attempt to "Set Role" it throws an "Attribute not found" error. I've tried to rename the attribute, since I suspect some of the formatting (either the "_" or the ".") might be causing the issue, although I'm not sure.
The reason you get this error is (in essence) very simple: the attribute you want is not part of the result after your "Read Salesforce" operator. You can always check what exactly you get by adding a breakpoint after an operator (right-click on it, select "Breakpoint After" and run the process). It will pause after the operator and display its outputs.
Now for some technical details regarding SOQL: Because we had to make use of their generic SOAP API to have a generic Read operator for everyone (as opposed to the custom-tailored SOAP API you get when working on your very own, private Salesforce instance), there are a few problems, mainly with distinguishing an attribute that is empty from one that was not even queried. The result we get from the Salesforce API for some queries is (pardon my French) a freaking mess. We are currently working on improving our heuristics for sifting through the extremely weird XML result data we get from Salesforce, to improve the usability of the operator.
Having said all that, I think in your specific case you are trying to access relationship fields via "object.field" while we (for internal reasons) have to rename the attribute to "object_field". Again, you can check what exactly you get by using breakpoints, but I suspect in your case exchanging "." with "_" might already suffice.
That matches my observations: the attributes I want are not part of the result after the "Read Salesforce" operation; instead, the "." is replaced with a "_". I've attempted to work around that by using either Rename or Rename by Replacing, but it doesn't work as intended (at all) because, like you've indicated, the attribute name appears to be different at process runtime, hence RapidMiner throws the 'Attribute not found' error.
The only way I've found to deal with this problem is manually typing in the attribute with the "_" instead of "." wherever it appears. This causes RapidMiner to throw 'Potential problem detect' notices since it's using an attribute that doesn't exist until the process is run.
Is there an operator, piece of code, or something else I can do to systematically convert the identified attributes (I assume RapidMiner pulls them from my SOQL query), or to have RapidMiner recognize them after the "." -> "_" conversion?
Please let me know so I can select either the previous post or the answer to that question as the best answer/solution.
I'm afraid for now your only option is to change "." to "_" manually before running the process. We will try to fix the metadata in the next version so that the preview metadata and the actual data both use "_".
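For anyone applying the manual workaround, the renaming can be sketched outside RapidMiner to produce the list of names to type into downstream operators. This is only a rough illustration in Python, not part of RapidMiner or the Read Salesforce operator. It assumes the operator does nothing more than replace every "." in a queried field with "_", as described in this thread, and it only handles a flat SELECT list (no subqueries or aggregate functions). The object name n_Quote_Line__c and the field n_Quote__r.Name are made up for the example; only n_Quote__r.n_Opp__r.Id comes from the error message above.

```python
import re


def soql_select_fields(query: str) -> list[str]:
    """Return the field expressions between SELECT and FROM of a flat SOQL query."""
    match = re.search(
        r"select\s+(.*?)\s+from\s", query, flags=re.IGNORECASE | re.DOTALL
    )
    if match is None:
        raise ValueError("no SELECT ... FROM clause found")
    return [field.strip() for field in match.group(1).split(",")]


def expected_attribute_names(query: str) -> dict[str, str]:
    """Map each queried field to the attribute name expected at runtime.

    Assumption (from this thread, not from official documentation):
    every "." in a relationship path becomes "_".
    """
    return {field: field.replace(".", "_") for field in soql_select_fields(query)}


# Hypothetical query; the object and the Name field are placeholders.
query = """
SELECT Id, n_Quote__r.Name, n_Quote__r.n_Opp__r.Id
FROM n_Quote_Line__c
"""

mapping = expected_attribute_names(query)
for soql_name, attribute_name in mapping.items():
    print(f"{soql_name} -> {attribute_name}")
```

The names on the right-hand side are the ones to enter in Set Role, Select Attributes, and similar operators. It is still worth confirming them with a breakpoint after Read Salesforce, since the actual naming may differ between versions.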