Multilogit in R - Model Application fails

Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
edited December 2018 in Help

Alright guys, I'm running into an error using R. I'm attaching a sample process. I can't post the actual data, but it's similar to the data below.

I'm using the mlogit package in R and I can build a model just fine. When I try to apply the model to new data, I get an application error (see screenshot below). There is a predict function in the package, but I'm wondering whether I need to fit the model first in the Apply stage. I was under the impression that I don't need to.

 

[Screenshot: Mlogit.png]

<?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="8.1.001" expanded="true" height="68" name="Generate Data" width="90" x="45" y="34">
<parameter key="target_function" value="random classification"/>
</operator>
<operator activated="true" class="generate_id" compatibility="8.1.001" expanded="true" height="82" name="Generate ID" width="90" x="179" y="34"/>
<operator activated="true" class="split_data" compatibility="8.1.001" expanded="true" height="103" name="Split Data" width="90" x="246" y="187">
<enumeration key="partitions">
<parameter key="ratio" value="0.7"/>
<parameter key="ratio" value="0.3"/>
</enumeration>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="8.1.000" expanded="true" height="82" name="Execute Mlogit Model" width="90" x="447" y="34">
<parameter key="script" value="library(mlogit)&#10;&#10;rm_main = function(data)&#10;{&#10;&#10;&#9;# print the meta data &#10; &#9;print(metaData)&#10; &#10; &#9;# access the meta data for every entry&#10; &#9;for(i in seq(along=metaData$data)) {&#10; &#9;print(paste(&quot;type of&quot;, names(metaData$data)[i], &quot;in the original example set:&quot;, metaData$data[[i]]$type))&#10; &#9;print(paste(&quot;role of&quot;, names(metaData$data)[i], &quot;in the original example set:&quot;, metaData$data[[i]]$role))&#10; }&#10; &#10;&#9;data$id &lt;- as.factor(data$id)&#10;&#9;data$label &lt;- as.logical(data$label)&#10;&#9;&#10;&#9;mModel &lt;- mlogit.data(formula = label, data = data, alt.var=&quot;id&quot;, shape= &quot;long&quot;)&#10;&#9;&#10;&#9;return(mModel)&#10;}&#10;"/>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="8.1.000" expanded="true" height="103" name="Apply Mlogit Model" width="90" x="648" y="187">
<parameter key="script" value="library(mlogit)&#10;&#10;rm_main = function(mModel, data)&#10;&#10;{&#9;&#9;&#10;&#9;# print the meta data &#10; &#9;print(metaData)&#10; &#10; &#9;# access the meta data for every entry&#10; &#9;for(i in seq(along=metaData$data)) {&#10; &#9;print(paste(&quot;type of&quot;, names(metaData$data)[i], &quot;in the original example set:&quot;, metaData$data[[i]]$type))&#10; &#9;print(paste(&quot;role of&quot;, names(metaData$data)[i], &quot;in the original example set:&quot;, metaData$data[[i]]$role))&#10; }&#10; &#10;&#9;data$id &lt;- as.factor(data$id)&#10;&#9;&#10;&#9;newM &lt;- mlogit.data(data, alt.var=&quot;id&quot;, shape =&quot;long&quot;)&#10;&#9;&#10;&#9;result &lt;-predict(mModel,newdata=newM)&#10;&#9;#result &lt;- predict(mModel, data)&#10;&#10;&#9;data$prediction &lt;- result&#10;&#9;&#10;&#9;metaData$data$prediction &lt;&lt;- list(type=&quot;real&quot;, role=&quot;prediction&quot;)&#10;&#9;&#10;&#9;return(data)&#10;&#9;&#10;}"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
<connect from_op="Generate ID" from_port="example set output" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_op="Execute Mlogit Model" to_port="input 1"/>
<connect from_op="Split Data" from_port="partition 2" to_op="Apply Mlogit Model" to_port="input 2"/>
<connect from_op="Execute Mlogit Model" from_port="output 1" to_op="Apply Mlogit Model" to_port="input 1"/>
<connect from_op="Apply Mlogit Model" from_port="output 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

Thoughts on this? Maybe R experts @yyhuang and @Telcontar120 have a clue?

Answers

  • SGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn

    Hi Tom,

     

    You piqued my curiosity, so I read the vignette for the mlogit package:

     

    mlogit deals with both format.  It provides a mlogit.data
    function that take as first argument a data.frame and returns a
    data.frame in “long” format with some information about the structure of the data.

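    Therefore mlogit.data returns a data frame, not a model. The object your first script hands to the Apply operator is just reshaped data, so predict has no fitted model to work with. You still need to call mlogit on that data frame to fit one.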

     

    Regards,

    Sebastian
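To make the distinction between reshaping and fitting concrete, here is a minimal sketch of the three steps, not the poster's actual process. It uses simulated data; the column names att1 and att2, the sample size, and the 140/60 split are assumptions for illustration. It also assumes an mlogit release in which mlogit.data() is still available.

```r
library(mlogit)

# Simulated stand-in for the RapidMiner example set (assumed columns).
set.seed(1)
n  <- 200
df <- data.frame(att1 = rnorm(n), att2 = rnorm(n))
df$label <- factor(ifelse(df$att1 + rnorm(n) > 0, "positive", "negative"))

train <- df[1:140, ]
test  <- df[141:200, ]

# Step 1: reshape only. The result is still data, not a model.
train_md <- mlogit.data(train, choice = "label", shape = "wide")

# Step 2: fit the model. Individual-specific variables go after the "|".
fit <- mlogit(label ~ 0 | att1 + att2, data = train_md)

# Step 3: reshape the new data the same way, then predict.
test_md <- mlogit.data(test, choice = "label", shape = "wide")
probs   <- predict(fit, newdata = test_md)

# predict() returns one probability column per alternative;
# pick the most likely alternative for each row.
pred <- colnames(probs)[max.col(probs)]
head(data.frame(probs, prediction = pred))
```

In a RapidMiner setup, steps 1 and 2 would sit in the training script and step 3 in the Apply script, with the fitted object passed between the two operators.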

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @SGolbert Ah, I see. When I change mlogit.data to just mlogit, I get an "object 'label' not found" error. I might have to convert the binominal label to an integer.
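One possible cause of the "object 'label' not found" message is the argument `formula = label` in the training script. This is an assumption, since the modified script isn't shown in the thread. A bare name like that is looked up as a variable, whereas a model formula is written with `~`. The base-R sketch below shows the difference on a made-up data frame; the column names and values are hypothetical.

```r
# Hypothetical stand-in for the example set passed to rm_main().
df <- data.frame(
  label = c("positive", "negative", "positive", "negative"),
  att1  = c(0.2, 1.5, 0.7, 2.1),
  att2  = c(1.1, 0.3, 0.9, 0.4)
)

# A bare name is evaluated as a variable in the calling environment,
# not as a column of df, so this fails when no such variable exists.
bare <- tryCatch(as.formula(label), error = function(e) conditionMessage(e))
print(bare)

# Writing the model with `~` builds a formula object; the names in it
# are only resolved against the data when a model is fitted.
f <- label ~ att1 + att2
print(class(f))
print(all.vars(f))

# Side note: as.logical() does not understand strings such as "positive",
# so a conversion like the one in the training script yields NA here.
print(as.logical(df$label))
```

If that is what is happening, converting the label to an integer may not be necessary. Passing a formula and letting mlogit.data() build the choice indicator from the label column might be enough.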
