
Make Sense of Your Multilingual Name Records with Rosette for RapidMiner!

Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
edited November 2018 in Knowledge Base

Try the Name Translation operator in our RapidMiner plugin and find duplicates easily, even across different languages and scripts.

By: Basis Tech


Guess what? We work in a global economy. You probably aren’t surprised to hear that; it’s quite common for companies, even tiny startups, to have partners and customers all over the world. Internationalization can be a boon for both innovation and business, but it also brings many challenges. If you have multilingual sales reps chasing leads across a wide range of countries, you’re likely to be confronted with a CRM database that contains a variety of languages and scripts.


Perhaps you work in marketing and you’d like to prepare a report on all your partners in Asia-Pacific. However, your CRM data contains person, organization, and location names written in Japanese, Korean, and Chinese. This is an issue we often run into on the Rosette team; we have offices in Israel and Tokyo and customers all over the world. Thankfully, multilingual text analytics is also our specialty.


Today we’ll show how you can use Rosette in tandem with the popular predictive analytics platform RapidMiner Studio to translate multilingual names in CRM data, using examples from our own Salesforce records.


Get the complete picture

Multilingual CRM data poses many challenges. First, it can be difficult to grasp the full scope of a business relationship when multiple languages are involved. Take, for example, the international company Hitachi. In our Salesforce, records of discussions we’ve had with Hitachi in the United States are written in English, but our Japanese office tracks its work with the company under 日立. You’d need to unify both accounts to get the global picture.


While some companies might prefer that all CRM data be recorded in English, it can be valuable for sales reps and field engineers to have customer information handy in their native tongue. It’s also useful to ensure that a customer’s name and address are recorded properly in their original language, saving you from possible errors of reverse translation later.


Our Salesforce also contains a variety of customer interaction records, including emails, support tickets, event registrations, and requests for information. Often this data arrives in English, but not always! In order to ensure that these records are associated with the correct account, we often need to translate the person, organization, and location names, which will allow us to correctly route the information and address it. For example, the screenshot below shows a new subscriber notification we got from Rosette API that will be automatically registered in our CRM:


[Screenshot: new subscriber notification from Rosette API]



In the next section we walk you through a simple RapidMiner process that draws data from our Salesforce into RapidMiner and uses Rosette’s name translation functionality to identify records that belong to the same account.

Using Rosette Name Translation with RapidMiner

Wait! If you haven’t downloaded the Rosette extension on the RapidMiner marketplace and signed up for your free Rosette API key, follow the steps listed in our Quick Start guide to extracting entities in RapidMiner Studio with Rosette.


Here’s a sample process we’ve used to help us deal with multilingual Salesforce data:

[Screenshot: sample RapidMiner process]



Starting with the “Read Salesforce” operator, we pulled data into RapidMiner Studio, selected the specific fields/attributes we wanted to have appear in our results, and ran Rosette Name Translation. You’ll notice that we used two “Translate Names” operators, one for each of the attributes we wanted to translate. In this example, we used “Name” and “AccountName (LocalLang).”
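For readers who think in code rather than operators, the flow described above can be sketched roughly in Python. This is only an illustration of the same three steps (read records, select attributes, translate two of them), not the RapidMiner process itself: `translate_name`, `select_attributes`, `add_translation`, and the lookup table are hypothetical stand-ins, the sample records are made up, and no real Rosette or Salesforce call is made. The output attribute names follow the ones reported in the results.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical stand-in for the name translation step: a tiny lookup table
# so the sketch runs offline. The entries are made up for illustration.
DEMO_TRANSLATIONS: Dict[str, str] = {
    "日立": "Hitachi",
    "山田": "Yamada",
}


def translate_name(name: str) -> str:
    """Return the English form if the demo table knows it, else the input."""
    return DEMO_TRANSLATIONS.get(name, name)


def select_attributes(records: List[Record], attributes: List[str]) -> List[Record]:
    """Keep only the listed attributes, like the attribute selection step."""
    return [{a: r[a] for a in attributes} for r in records]


def add_translation(
    records: List[Record],
    source: str,
    target: str,
    translate: Callable[[str], str] = translate_name,
) -> List[Record]:
    """Add `target`, holding the translated value of `source`, to each record."""
    return [{**r, target: translate(r[source])} for r in records]


# Made-up records standing in for what a Salesforce read might return.
raw = [
    {"Id": "001", "Name": "山田", "AccountName (LocalLang)": "日立"},
    {"Id": "002", "Name": "Smith", "AccountName (LocalLang)": "Hitachi"},
]

rows = select_attributes(raw, ["Name", "AccountName (LocalLang)"])
# One translation pass per attribute, mirroring the two operators.
rows = add_translation(rows, "Name", "LastName (English)")
rows = add_translation(rows, "AccountName (LocalLang)", "AccountName(English)")
```

Each pass leaves the original attribute untouched and adds a new one next to it, so the local-language value stays available alongside its English form.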


[Screenshot: name records with account names blurred]

These aren’t real names from our Salesforce, but the accounts are, so we’ve blurred them out for confidentiality purposes.


When we run the process, our results show two new attributes, “LastName (English)” and “AccountName(English).” The names above are a mix of Japanese, Chinese, and Korean. All have been translated to English and are ready for further processing and analysis.
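One kind of further processing is grouping records that share an English account name, which is how records filed under 日立 and Hitachi could end up side by side. The Python below is a minimal sketch of that idea under stated assumptions: the function names and sample records are hypothetical, and matching is a plain comparison after trimming whitespace and ignoring case, which is much cruder than real name matching.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, str]


def group_by_english_name(
    records: List[Record], key: str = "AccountName(English)"
) -> Dict[str, List[Record]]:
    """Group records by their English account name, ignoring case and spacing."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        normalized = " ".join(record[key].split()).casefold()
        groups[normalized].append(record)
    return dict(groups)


def shared_accounts(
    records: List[Record], key: str = "AccountName(English)"
) -> Dict[str, List[Record]]:
    """Return only the groups that contain more than one record."""
    grouped = group_by_english_name(records, key)
    return {name: members for name, members in grouped.items() if len(members) > 1}


# Made-up records that already carry the translated attribute.
translated = [
    {"AccountName (LocalLang)": "日立", "AccountName(English)": "Hitachi"},
    {"AccountName (LocalLang)": "Hitachi", "AccountName(English)": "hitachi "},
    {"AccountName (LocalLang)": "Example Corp", "AccountName(English)": "Example Corp"},
]

merged = shared_accounts(translated)
```

Here `merged` holds a single group containing both Hitachi records, which are candidates for unification into one account view.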

Next Steps

When working in a complex, international environment, you need to properly route and analyze CRM (or help desk) records, and that requires clean data. Thanks to our name translation tool, whether you use it with RapidMiner Studio or any other platform, you can ensure that your data contains one unique record per customer or prospect alongside a standardized translation for all your cross-lingual entries.


For further information on how Name Translation could apply to your use case, don’t hesitate to contact us at





  • yeshuurao Member Posts: 2 Newbie
    This technique seems to be very impressive!

    Open source CRM software is a great fit for technically inclined companies that want to adjust their software’s code themselves and are less concerned about tech support and manual upgrades.

    Feel free to check out our website: Field Engineer - Best freelance site for telecom engineer.