mine service tickets

DataFighter · May 2016

We have an old ticket management system that has very few structured fields.

The only fields holding valuable information are Summary, Remarks, and a Memo field that contains a detailed description of the ticket (problem, observable cause, failure modes, planning details, execution details, and the worker's feedback).

I'm looking for a way to extract the main causes of these tickets, as well as other types of information.

Any ideas on how I can do this using RapidMiner?

P.S.: I'm new to machine learning, so please don't be too hard on me!

bhupendra_patil · May 2016

Hello DataFighter,

This is a blog that explains how to do text mining with RapidMiner.

http://vancouverdata.blogspot.co.uk/2010/11/text-analytics-with-rapidminer-loading.html

It's pretty detailed and should cover all aspects of text mining that will be needed for your case.

Please let us know how you progress.

Additionally, there are dozens of other resources on text mining available when you search. I haven't looked at them, but they could be handy.

Thomas_Ott · May 2016

You can try this sample process. It uses word clustering and association rules.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.1.001" expanded="true" height="68" name="Retrieve StoredMinutesPDF (2)" width="90" x="45" y="34">
        <parameter key="repository_entry" value="../data/StoredMinutesPDF"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.1.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="30">
        <parameter key="attribute_name" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="30">
        <parameter key="attribute_filter_type" value="value_type"/>
        <parameter key="value_type" value="numeric"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.1.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="447" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="sentiment"/>
        <parameter key="invert_selection" value="true"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="7.1.001" expanded="true" height="103" name="Multiply" width="90" x="514" y="210"/>
      <operator activated="true" class="transpose" compatibility="7.1.001" expanded="true" height="82" name="Transpose" width="90" x="715" y="30"/>
      <operator activated="true" class="x_means" compatibility="7.1.001" expanded="true" height="82" name="X-Means" width="90" x="849" y="30"/>
      <operator activated="true" class="select_attributes" compatibility="7.1.001" expanded="true" height="82" name="Select Attributes (3)" width="90" x="648" y="300">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value="cluster"/>
        <parameter key="attributes" value="|sentiment|cluster"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="numerical_to_binominal" compatibility="7.1.001" expanded="true" height="82" name="Numerical to Binominal" width="90" x="648" y="390"/>
      <operator activated="true" class="fp_growth" compatibility="7.1.001" expanded="true" height="82" name="FP-Growth" width="90" x="648" y="480">
        <parameter key="min_number_of_itemsets" value="10"/>
        <parameter key="max_items" value="5"/>
      </operator>
      <operator activated="true" class="create_association_rules" compatibility="7.1.001" expanded="true" height="82" name="Create Association Rules" width="90" x="782" y="480"/>
      <operator activated="true" class="item_sets_to_data" compatibility="7.1.001" expanded="true" height="82" name="Item Sets to Data" width="90" x="916" y="544"/>
      <connect from_op="Retrieve StoredMinutesPDF (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Select Attributes (2)" to_port="example set input"/>
      <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Transpose" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Select Attributes (3)" to_port="example set input"/>
      <connect from_op="Transpose" from_port="example set output" to_op="X-Means" to_port="example set"/>
      <connect from_op="X-Means" from_port="cluster model" to_port="result 1"/>
      <connect from_op="Select Attributes (3)" from_port="example set output" to_op="Numerical to Binominal" to_port="example set input"/>
      <connect from_op="Numerical to Binominal" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
      <connect from_op="FP-Growth" from_port="frequent sets" to_op="Create Association Rules" to_port="item sets"/>
      <connect from_op="Create Association Rules" from_port="rules" to_port="result 2"/>
      <connect from_op="Create Association Rules" from_port="item sets" to_op="Item Sets to Data" to_port="frequent item sets"/>
      <connect from_op="Item Sets to Data" from_port="example set" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>
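
For anyone reading without the StoredMinutesPDF data, the lower branch of the process (Numerical to Binominal, FP-Growth, Create Association Rules) can be pictured with a small Python sketch. This is not RapidMiner code and not the process itself: the ticket texts are invented, the function names are hypothetical, and item sets are counted by brute force rather than with FP-Growth, so it only suits tiny inputs.

```python
from itertools import combinations

# Invented example tickets; real input would be the Summary/Remarks/Memo text.
TICKETS = [
    "pump leak seal worn",
    "pump leak seal replaced",
    "motor overheating fan blocked",
    "pump leak gasket worn",
]

def to_item_sets(texts):
    """Binarize each text: a word is either present or absent in a ticket."""
    return [frozenset(text.split()) for text in texts]

def frequent_item_sets(transactions, min_support, max_items=2):
    """Return {item set: support} for sets of up to max_items words
    whose support (share of tickets containing them) is >= min_support."""
    n = len(transactions)
    vocabulary = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_items + 1):
        for candidate in combinations(vocabulary, size):
            items = frozenset(candidate)
            count = sum(1 for t in transactions if items <= t)
            if count / n >= min_support:
                result[items] = count / n
    return result

def association_rules(item_sets, min_confidence):
    """Build rules (premise, conclusion, confidence) from frequent pairs,
    where confidence = support(pair) / support(premise)."""
    rules = []
    for items, support in item_sets.items():
        if len(items) != 2:
            continue
        for premise in items:
            conclusion = next(iter(items - {premise}))
            confidence = support / item_sets[frozenset([premise])]
            if confidence >= min_confidence:
                rules.append((premise, conclusion, round(confidence, 2)))
    return sorted(rules)

sets = frequent_item_sets(to_item_sets(TICKETS), min_support=0.5)
rules = association_rules(sets, min_confidence=0.9)
for premise, conclusion, confidence in rules:
    print(f"{premise} -> {conclusion} (confidence {confidence})")
```

A rule such as "seal -> leak" reads as "tickets that mention seal nearly always mention leak too", which is the kind of output the Create Association Rules operator delivers on the real word vectors.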

DataFighter · May 2016

Thanks TBone,

What format is the attribute "label"?

Without sample data, it's hard for me to understand what's going on and what I should be using in which operators.

Sorry, as I mentioned earlier, I'm new to machine learning and text mining.

DataFighter · May 2016

Thanks Bhupendra_patil,

I've looked at some of the videos and I got stuck at stemming.

Our database is in French.

Are there any stemming operators made for the French language?

... Never mind, I just found Snowball stemming!

JEdward · May 2016

As mentioned, start with text mining and clustering to try to get summaries of the problems grouped into categories.
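
To make the grouping idea concrete outside RapidMiner, here is a minimal Python sketch. It assumes invented ticket texts and uses a greedy threshold grouping on Jaccard word overlap, which is much cruder than k-means or X-Means; the function names and the 0.4 threshold are hypothetical choices for illustration only.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_tickets(texts, threshold=0.4):
    """Put each text into the first group whose first member is similar
    enough; otherwise start a new group. Groups hold indices into texts."""
    groups = []
    word_sets = [set(t.lower().split()) for t in texts]
    for i, words in enumerate(word_sets):
        for group in groups:
            if jaccard(words, word_sets[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            # No existing group was close enough.
            groups.append([i])
    return groups

# Invented example tickets.
TICKETS = [
    "pump leak seal worn",
    "motor overheating fan blocked",
    "pump leak seal replaced",
    "fan blocked motor stopped",
]

groups = group_tickets(TICKETS)
for group in groups:
    print([TICKETS[i] for i in group])
```

On real ticket text you would normally remove stop words and stem first, so that different forms of the same word count as one.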

One thing you don't mention having is timestamps for the tickets. If you do have them, you can also use association rules or clustering to see which problems tend to happen around certain times, and then investigate potential correlations and causes (for example, on humid days the electronics of the computers run slowly and crash more often).
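
A first look at the time angle does not need a model at all. The sketch below, in plain Python, counts ticket categories per calendar month. The timestamps, the timestamp format, and the category names are all invented, and it assumes each ticket has already been assigned a category by an earlier text mining step.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Invented (timestamp, category) pairs for illustration.
TICKETS = [
    ("2016-01-12 08:30", "pump leak"),
    ("2016-07-03 14:10", "overheating"),
    ("2016-07-19 15:45", "overheating"),
    ("2016-07-21 09:05", "pump leak"),
    ("2016-08-02 16:20", "overheating"),
]

def categories_by_month(tickets):
    """Count how often each category occurs in each calendar month (1-12)."""
    counts = defaultdict(Counter)
    for stamp, category in tickets:
        month = datetime.strptime(stamp, "%Y-%m-%d %H:%M").month
        counts[month][category] += 1
    return counts

def peak_month(counts, category):
    """Return the month in which the given category was reported most often."""
    return max(counts, key=lambda month: counts[month][category])

by_month = categories_by_month(TICKETS)
print(peak_month(by_month, "overheating"))
```

If one category clearly peaks in certain months, that is a hint worth checking against outside factors such as weather or maintenance schedules before treating it as a cause.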

Have a look at the website www.rapidprom.org for inspiration on what you'll be able to do when you have the tickets all cleaned up. It might give you some nice ideas for the next step of your internal ticket management systems.
