Outlook emails exported to Excel to be grouped by Subject

lovelikecheese Member Posts: 2 Contributor I
Hi RapidMiner Community,

I came across the Aggregate function and would like to group my emails by subject. However, I realise that its group-by works differently from the Group By feature in Outlook.

Take the email subjects below, for instance:
Email 1: This is it
Email 2: Re: This is it
Email 3: FW: This is it


In Outlook, the three emails above are grouped under one subject; the Aggregate function, however, treats them as three different subjects.

Are there any workarounds for this?


Thank you!

Best Answer

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 712   Unicorn
    Solution Accepted
    Hi!

    One way to solve this is using the Replace operator. This will use a regular expression to find the text you're looking for and replace it with the text you specify.

    You're looking for a word followed by a colon and optional whitespace at the beginning of the subject, possibly repeated. A regular expression for this is the following:

    ^([A-Z][A-Za-z]+: *)*

    The replacement in this case would be the empty string (just leave "replace by" empty).

    Regards,
    Balázs
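
For readers who want to try the pattern outside RapidMiner, here is a minimal Python sketch of the same idea. It is an illustration only, not the Replace operator itself: the helper name `normalize_subject` and the sample subjects are made up for this example, and the regular expression is the one quoted in the answer.

```python
import re
from collections import Counter

# Pattern from the answer: zero or more leading "Word: " prefixes
# (a capital letter, more letters, a colon, optional spaces).
PREFIX_PATTERN = re.compile(r"^([A-Z][A-Za-z]+: *)*")


def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes (hypothetical helper)."""
    return PREFIX_PATTERN.sub("", subject)


# Sample subjects taken from the question.
subjects = ["This is it", "Re: This is it", "FW: This is it"]

# Group by the cleaned subject, analogous to aggregating after Replace.
counts = Counter(normalize_subject(s) for s in subjects)
print(counts)  # Counter({'This is it': 3})
```

Two things to keep in mind with this pattern: it strips any capitalised word followed by a colon (so "Update: server down" becomes "server down"), and it does not match all-lowercase prefixes such as "re:".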

Answers

  • lovelikecheese Member Posts: 2 Contributor I
    Thanks @BalazsBarany!

    This does remove the prefixes and group the emails under one subject, but I'm having a problem now...

    Say the total count of emails I received between 1st July and 31st Oct is 4070; after using Replace, I get 4411 emails instead.



  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 712   Unicorn
    Hi,

    just set breakpoints before and after your operators and check which operator causes the duplicated data. Just press F7/Shift+F7 or use the right-click menu on the operator.

    Replace doesn't change the number of examples, so it has to be something different.

    Regards,
    Balázs
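
The breakpoint approach amounts to comparing the row count before and after each step. A rough Python sketch of that idea follows; everything in it is hypothetical (the `trace_row_counts` helper, the step names, and especially the duplicating step, which only stands in for whatever operator actually adds rows in the real process).

```python
import re
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


def trace_row_counts(rows: List[Row],
                     steps: List[Tuple[str, Step]]) -> List[Tuple[str, int]]:
    """Run each step in order and record the row count after it."""
    counts = [("input", len(rows))]
    for name, step in steps:
        rows = step(rows)
        counts.append((name, len(rows)))
    return counts


def strip_prefixes(rows: List[Row]) -> List[Row]:
    # A pure text replacement: one output row per input row.
    pattern = re.compile(r"^([A-Z][A-Za-z]+: *)*")
    return [{**r, "subject": pattern.sub("", r["subject"])} for r in rows]


def hypothetical_duplicating_step(rows: List[Row]) -> List[Row]:
    # Assumption for illustration: some later step that emits extra rows.
    return rows + rows[:1]


emails = [{"subject": "This is it"},
          {"subject": "Re: This is it"},
          {"subject": "FW: This is it"}]

trace = trace_row_counts(emails, [
    ("replace", strip_prefixes),
    ("hypothetical_step", hypothetical_duplicating_step),
])
for name, n in trace:
    print(f"{name}: {n} rows")
```

The first step whose count differs from the previous one is the one to inspect; the replacement step leaves the count unchanged.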