Defining the strength of a transition graph by counting?

eldenoso Member Posts: 65 Contributor I
edited November 2018 in Help

Hello everybody,

Currently I want to generate a transition graph of people moving from one hotel to another, year by year. Generating the graph was no problem, but I don't know how to set the strength and activate the "edge labels".

What I want to achieve is that the number of people who changed hotels is displayed as the strength, i.e. the "edge label". I tried different solutions with the Aggregate operator, but if I aggregate, for example, the number of hotels, the "edge label" is the same for every edge...

I hope someone can help me there.

<?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
<operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.4.000" expanded="true" height="68" name="Retrieve Flatfile_Kundenwanderung_20170331 V04 roh" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Local Repository/Flatfile_Kundenwanderung_20170331 V04 roh"/>
<operator activated="true" class="select_attributes" compatibility="7.4.000" expanded="true" height="82" name="Select Attributes" width="90" x="246" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="Action_2013|Action_2014|Action_2015|Action_2016|CRMPrivatkundeID|PLZ|Land|Kundensegment|Kundengruppe|Haushaltstyp|Geschlecht|Geburtsjahr|DauerKundenbeziehung|Bundesland|Buchungsmuster|Bevorzugtebuchungsart|Alter heute Klasse|Alter heute"/>
<operator activated="true" class="aggregate" compatibility="7.4.000" expanded="true" height="82" name="Aggregate" width="90" x="313" y="238">
<list key="aggregation_attributes">
<parameter key="Action_2013" value="count"/>
<parameter key="group_by_attributes" value="Action_2013"/>
<operator activated="true" class="join" compatibility="7.4.000" expanded="true" height="82" name="Join" width="90" x="514" y="238">
<parameter key="join_type" value="right"/>
<parameter key="use_id_attribute_as_key" value="false"/>
<list key="key_attributes">
<parameter key="Action_2013" value="Action_2013"/>
<operator activated="true" class="transition_graph" compatibility="7.4.000" expanded="true" height="82" name="Transition Graph" width="90" x="447" y="34">
<parameter key="source_attribute" value="Action_2013"/>
<parameter key="target_attribute" value="Action_2014"/>
<parameter key="strength_attribute" value="count(Action_2013)"/>
<connect from_op="Retrieve Flatfile_Kundenwanderung_20170331 V04 roh" from_port="output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
<connect from_op="Aggregate" from_port="example set output" to_op="Join" to_port="left"/>
<connect from_op="Aggregate" from_port="original" to_op="Join" to_port="right"/>
<connect from_op="Join" from_port="join" to_op="Transition Graph" to_port="example set"/>
<connect from_op="Transition Graph" from_port="example set" to_port="result 2"/>
<connect from_op="Transition Graph" from_port="transition graph" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>

Best Answer

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist
    Solution Accepted

    Maybe you can add more "group by" attributes to the Aggregate operator. If you group by the hotel names (hotel_from and hotel_to), you get the count of transitions for each pair of hotels.
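
To illustrate why the grouping matters, here is a minimal sketch in plain Python (not RapidMiner itself). The rows and hotel names are made-up sample data, and the variable names are hypothetical. Counting by the source hotel alone yields one number per source, which is then repeated on every edge leaving that hotel; counting by the (source, target) pair yields one number per edge.

```python
from collections import Counter

# Hypothetical sample rows: (hotel in 2013, hotel in 2014). Values are made up.
rows = [
    ("Hotel A", "Hotel B"),
    ("Hotel A", "Hotel B"),
    ("Hotel A", "Hotel C"),
    ("Hotel B", "Hotel A"),
    ("Hotel B", "Hotel B"),
    ("Hotel C", "Hotel A"),
]

# Grouping by the source only: every edge leaving a hotel would share this count.
per_source = Counter(source for source, _ in rows)

# Grouping by source AND target: one count per edge, i.e. the transition strength.
per_edge = Counter(rows)

for (source, target), n in sorted(per_edge.items()):
    print(f"{source} -> {target}: {n} (source-only count: {per_source[source]})")
```

In the process from the question, the equivalent step would presumably be to group by both Action_2013 and Action_2014 rather than by Action_2013 alone.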


  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Hi, there's something wrong with your XML. I can't seem to get it to work. Can you export your process and attach it instead?



  • eldenoso Member Posts: 65 Contributor I

    Thank you for your replies,

    I attached it to this post. The problem when aggregating is that the same number is assigned to every edge instead of the actual number of transitions.


  • eldenoso Member Posts: 65 Contributor I

    Okay, I got it to work. Grouping was the right solution. But I have a question concerning the transition graph and matrix.

    What does the number of edges represent and what is the right way to use this?

    Is there a way to change the thickness of the edges in relation to how many people changed hotels?


    The Transition Matrix operator gives probabilities of transitions, but in which context, given that you only pass it one attribute?

    Thank you :-)

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    Great question! @eldenoso


    The Transition Graph operator can use a third column in the example set to define the thickness of the edges in relation to how many people changed hotels. In the source code it is called the "Strength Attribute"; it defines the strength of the transition, for example the number of times this transition occurred, as obtained from an aggregation.
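
As a rough illustration of the idea only, and not of how RapidMiner actually renders edges, the sketch below maps per-edge transition counts to line widths by linear scaling. The function name `edge_widths`, the width range, and the sample counts are all assumptions made for this example.

```python
def edge_widths(edge_counts, min_width=1.0, max_width=8.0):
    """Map each edge's transition count to a line width by linear scaling.

    edge_counts: dict mapping (source, target) -> number of transitions.
    The width range is an arbitrary choice for illustration.
    """
    if not edge_counts:
        return {}
    lo = min(edge_counts.values())
    hi = max(edge_counts.values())
    if hi == lo:
        # All edges equally strong: use the middle of the width range.
        return {edge: (min_width + max_width) / 2 for edge in edge_counts}
    scale = (max_width - min_width) / (hi - lo)
    return {edge: min_width + (n - lo) * scale for edge, n in edge_counts.items()}


# Hypothetical strengths, e.g. the result of a count aggregation per hotel pair.
counts = {("Hotel A", "Hotel B"): 2, ("Hotel A", "Hotel C"): 1, ("Hotel B", "Hotel A"): 5}
widths = edge_widths(counts)
for edge, width in widths.items():
    print(edge, width)
```

The weakest edge gets the minimum width and the strongest the maximum, so the relative thickness reflects how many people made each transition.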


    Our data scientists also developed a RapidMiner HypGraphs Extension for sequential pattern analysis. If possible, could you PM me the data so that I can run your process and test the other options?




