Information for bachelor thesis

hoetzels Member Posts: 2 Contributor I
edited November 2018 in Help

Hello everybody,

At the moment I am writing my bachelor thesis for a German company.

My topic is to show some ways in which large amounts of data can be summarized. The data is not stored in a database; it arrives, for example, in an email inbox as PDF or Office (Word/Excel) files. The person who sends the data should not have to do any work to convert it or fit it into a special format.

Is it possible to use a RapidMiner program to get the crucial information out of a mass of data? And can I trace that information back to the document it came from?

I would be very grateful for any information.

Thanks

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,

    Yes, this is possible in general. All you need is to design a process that extracts the important content from the text documents. If you then install RapidAnalytics, it can automatically monitor an email inbox and retrieve and process each incoming mail.
    The real problem lies in finding a good data mining process for the content extraction...

    Greetings,
      Sebastian
  • hoetzels Member Posts: 2 Contributor I
    Thanks for the response,


    What do you mean by a good data mining process (just in a few words)?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    That's easy: a good process is one that fulfills all goals of a given task with low memory consumption and a short runtime. Non-functional properties, such as a simple process setup that is easy to maintain, can be added too.

    Greetings,
    Sebastian
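
As a rough illustration of the content-extraction idea discussed in this thread, and of tracing a result back to its document, here is a minimal Python sketch. It is not a RapidMiner process. It assumes the text has already been extracted from the PDF or Office files by some other tool, and the function names, the stop-word list, and the sample documents are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list; a real process would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "by"}


def tokenize(text):
    """Lowercase the text and return its word tokens without stop words."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(documents, top_n=3):
    """Summarize each document by its most frequent terms.

    `documents` maps a source identifier (assumed here to be something like
    "mail id/attachment name") to the plain text extracted from that file.
    Returns (summaries, index): summaries maps source -> top terms, and
    index maps term -> set of sources, which is what allows tracing a term
    back to the documents it came from.
    """
    summaries = {}
    index = defaultdict(set)
    for source, text in documents.items():
        counts = Counter(tokenize(text))
        top_terms = [term for term, _ in counts.most_common(top_n)]
        summaries[source] = top_terms
        for term in top_terms:
            index[term].add(source)
    return summaries, dict(index)


# Made-up sample input standing in for text extracted from mail attachments.
docs = {
    "mail_001/report.pdf": "Invoice for pump repair. The pump repair is due in March.",
    "mail_002/notes.docx": "Meeting notes on the pump delivery and the delivery schedule.",
}
summaries, index = summarize(docs, top_n=2)
```

Looking up `index["pump"]` then gives the set of source identifiers whose summary contains that term, so every summarized item stays linked to its original document. Plain term frequency is only a stand-in here; finding an extraction method that fits the actual data is the hard part.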