Text mining of crowdfunding data, including numerical metadata

seba77 Member Posts: 2 Contributor I
edited January 2020 in Help

I'm trying to analyze crowdfunding datasets for a study project.
Each row of the dataset contains, among other things, a description of the campaign and the amount of money raised for it.
The goal is to count the total occurrences of certain terms in the dataset using text mining.
The basic text mining process is not the problem. For example, I found that the term "android" appears in 150 crowdfunding campaigns.
Now it would be interesting to find out how much money was raised by the campaigns that contain this word.
So, in theory, I would add up the values from the "money raised" cell of every campaign that contains the word "android".
The result should look like this, with an additional column showing the total money raised:

word        attribute name   document occurrences   total money raised
android     android          150                    10.000
wordpress   wordpress        120                    8.000

Is this possible with text mining?
Thank you in advance!


    Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Yup, use a Wordlist to Data operator to create a data table, then use a Generate Attributes operator to create a new attribute column named "total money raised" and create a function to generate your $$$.


    Or you can use an Aggregate operator, too.
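
To make the aggregation concrete, here is a minimal sketch in plain Python (not a RapidMiner process) of the calculation being asked for: for each term, count the campaigns whose description contains it and add up the money those campaigns raised. The function name `term_totals`, the keys `description` and `raised`, and the three sample rows are all made up for illustration; the simple regex tokenizer is also an assumption and will not match whatever tokenization your text mining process uses.

```python
import re
from collections import defaultdict

def term_totals(campaigns, text_key="description", money_key="raised"):
    """Per term: number of campaigns containing it and the sum of their raised money.

    The key names are hypothetical; adapt them to the real dataset columns.
    """
    doc_count = defaultdict(int)
    money_sum = defaultdict(float)
    for row in campaigns:
        # A set ensures a term repeated inside one description
        # is counted once per campaign (document occurrence).
        terms = set(re.findall(r"[a-z0-9]+", row[text_key].lower()))
        for term in terms:
            doc_count[term] += 1
            money_sum[term] += row[money_key]
    return {term: (doc_count[term], money_sum[term]) for term in doc_count}

# Made-up rows, for illustration only
campaigns = [
    {"description": "Android app for runners", "raised": 100.0},
    {"description": "WordPress theme and Android widget", "raised": 50.0},
    {"description": "WordPress plugin", "raised": 25.0},
]

totals = term_totals(campaigns)
print(totals["android"])    # (2, 150.0)
print(totals["wordpress"])  # (2, 75.0)
```

In terms of a document-term table, this is the same as multiplying each binary term-occurrence column by the money-raised column and summing the result per term.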
