I'm investigating whether I can build a model over an IT ticketing system that tracks things like vendor hardware and software issues. I'm new to text mining, and I was wondering whether it's possible to parse attributes that contain error messages, such as Windows BSOD-type messages, where the text may contain both regular and extended ASCII characters.
Is this a typical workload in text analytics, or am I looking at a long, difficult task of parsing error messages? I'd greatly appreciate any feedback.
Re: text mining Salesforce / Remedy ticketing systems
As RapidMiner was developed in Germany, support for different encodings and extended characters was built into it from the start. I applied the text mining methods to Japanese texts without any problems.
Your use case is very interesting.
For Salesforce there is an operator for reading and writing data directly, similar to SQL databases. If the ticket system has an API (or you can access its database), you can also integrate the text of the tickets.
When analyzing this kind of data, you can look at the content of the message (the words and characters in it) but also at its structure (e.g. the BSOD probably has a fixed format which you can find with a regular expression). Combining the structured and the unstructured properties tends to give you a better model.
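To illustrate the idea of combining both kinds of properties outside of RapidMiner, here is a minimal Python sketch. It assumes the ticket text contains a classic Windows stop-error line of the form "*** STOP: 0x0000007B (0x..., 0x..., ...)"; that pattern, the function name extract_features, and the sample ticket are illustrative assumptions, so adjust the regular expression to whatever your tickets actually contain.

```python
import re
from collections import Counter

# Assumed format of a classic Windows stop-error line, e.g.
# "*** STOP: 0x0000007B (0xF78D2524, 0xC0000034, 0x00000000, 0x00000000)".
# Real tickets may differ; treat this pattern as a starting point.
STOP_RE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]{8})\s*\(([^)]*)\)")

# Runs of letters only (no digits or underscores). In Python 3 this is
# Unicode-aware, so extended characters such as "ü" stay inside words.
WORD_RE = re.compile(r"[^\W\d_]+")


def extract_features(ticket_text):
    """Return (structured, words) for one ticket (hypothetical helper)."""
    match = STOP_RE.search(ticket_text)

    # Structured part: fields pulled out of the fixed-format error line.
    structured = {
        "has_stop_code": match is not None,
        "stop_code": match.group(1).lower() if match else None,
        "stop_params": (
            [p.strip() for p in match.group(2).split(",")] if match else []
        ),
    }

    # Unstructured part: word counts over the rest of the text, with the
    # stop-error line removed so hex fragments do not pollute the counts.
    free_text = STOP_RE.sub(" ", ticket_text)
    words = Counter(w.lower() for w in WORD_RE.findall(free_text))

    return structured, words


if __name__ == "__main__":
    sample = (
        "Laptop of Mr. Müller crashes after update.\n"
        "*** STOP: 0x0000007B (0xF78D2524, 0xC0000034, 0x00000000, 0x00000000)\n"
        "INACCESSIBLE_BOOT_DEVICE"
    )
    structured, words = extract_features(sample)
    print(structured)
    print(words.most_common(5))
```

The structured fields (for example the stop code) can then be fed to a model as ordinary nominal attributes next to the word counts from the free text.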