Re: "Creating SVDs in X-Validation operator very slow"

Comment by text_miner · February 2010

Sebastian, I agree, a warning would be nice. In addition, another thing to consider is changing the TFIDFFilter class to set zeros for columns without any counts. Although the missing values can currently be changed to zeros with the Replace Missing Values operator, this (1) requires the use of another operator and (2)…
[Solved] Problem with TF IDF calculation

Discussion by daniel · August 2012

Hello everyone, I am currently working on a task where I have to resample some data. Because I am unsure whether it's okay to use a method like SMOTE on already calculated TF-IDF weights, I wanted to calculate the term occurrences in RapidMiner, export and SMOTE the data, and later import it and calculate the TF-IDF weights. When I…
Re: Extracting the most representative 10 keywords from web page

Comment by Telcontar120 · September 2017

As @Thomas_Ott suggests, this is definitely possible, but it will require a series of operators. Working with text from web pages can be quite tricky because of all the extra HTML and formatting. It also depends on what you mean by "10 most representative" words. Many times, the most frequent words are not necessarily the…
Process Document from Data

Question by waqaskhan343 · April 2018

Hello, everyone! I am a complete beginner in RapidMiner and am doing sentiment analysis on tweets. I have a problem at a basic level. I am using the Process Documents from Data operator to generate TF-IDF vectors and word counts after cleaning the tweets. I have opened an Excel file containing 2000 tweets with reading excel…
Text clustering and labeling

Question by amir_askary_sha · October 2017

Hi, I'm using RapidMiner for text clustering (k-means) and then labeling the clusters. We usually have around 2000 documents, and the texts are in German. The texts are short (title and short description of news or articles), and so far RapidMiner is working nicely! In the text processing phase, I use Term Frequency vectors,…
Re: Text Pre-processing

Comment by Telcontar120 · June 2020

You need to download and install the free text mining extension from the marketplace. The operator "Process Documents" will generate a word vector using term frequency if you set that as the option in the parameters (TF-IDF is the default), and it will also automatically generate the bag of words for you if you use the…
[SOLVED] Rename regular attributes generated by Text Processing

Question by Ruca · March 2012

Hi all, I'm a newbie in using RapidMiner. I hope I'm placing my issue in the right place. But first of all, let me congratulate the support team for launching this forum. I hope I can also contribute to solving other issues. Going back to my problem: I'm using the Text Processing module in order to create term vector…
Re: [SOLVED] Transform Document-Term matrix to flat table?

Comment by RWingerter · April 2013

Hi Marcin, thanks for your reply. Here is my example data and my simple process. The input is a list of user queries (query_id, query, frequency), which is processed with "Process Documents from Data". The result is a word list and a document-term matrix. In addition, I would like to get a term-document table with Term,…

2 results
LOF on Text Data

Question by tamberge · April 2019

Hello Team, I am fairly new to RM and currently conducting some research on online text. In particular, I am trying to detect outliers in a set of documents by using the LOF operator. I have run into some trouble, since the LOF for each document is very close to 1, no matter how I set MinPtsUB and MinPtsLB. Basically I…
"Generate pivoted example set from word vector"

Question by Chiko · May 2016

Hi, I have some text entries in an Excel worksheet that I would like to text mine to find associations (if any) between some words. My initial thinking was to process the text with Process Documents from Data -> Convert WordList to Data and then pivot it. The problem is that after processing documents, I only get a word…

36 results

Altair RapidMiner Community
