How to Extract Numbers from Text Mining

danong Member Posts: 3 Contributor I
edited June 2019 in Help
Hi,

I have tokenized the text and filtered out some words, which leaves only numbers and English words.

My problem now is that I want to extract the numbers and the English words separately and put them in different results.

How can I achieve that?

By the way, I'm using the text mining tool here; the file is in .txt format and is semi-structured.


Thanks for helping.

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Sorry, I did not get your point. Can you give us an example, ideally showing the data before the desired transformation and what you would like to achieve?

    Cheers,
    Ingo
  • danong Member Posts: 3 Contributor I
    Hi, thanks for the reply.

    I have actually solved the problem.

    Okay, I will rephrase my problem here:


    I had a text file containing, for example: "Bobbie goes to school today in the morning at 8 oclock with his 30 packs of noodles."

    I would like to extract the English words (Bobbie, goes, to, etc.) as well as the numbers (8, 30).

    But I found that the filter allows only one thing at a time, either English words or numbers,
    but not both.


    I could not find another way,
    but in the end I loaded the file twice and did the filtering separately, and that solved it.


    Thanks.
  • LiZeyuan Member Posts: 1 Contributor I
    Hey mate,

    I am a RapidMiner beginner
    and I am facing a similar issue: I want to extract the numbers from the text, e.g.
    "the task finished at the year 2018"
    I just need the numeric information, "2018". How do I filter out the words when tokenizing?

    Thanks,
    much appreciated

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,
    There was a similar discussion recently on the community: https://community.rapidminer.com/discussion/55230/how-to-extract-year-from-a-string
    Maybe this can give you some hints.
    Cheers,
    Ingo
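
For readers who want to see the idea outside RapidMiner, here is a minimal sketch in plain Python. It is not the approach from the linked discussion, and the helper name `extract_numbers` is made up for illustration. It assumes that "numeric information" means unbroken runs of digits, so a value such as "3.5" would come back as two separate matches.

```python
import re

def extract_numbers(text):
    """Return every run of digits found in text, in order of appearance.

    Assumption: a "number" is one or more consecutive digits.
    """
    return re.findall(r"\d+", text)

# Example sentence taken from the question above
numbers = extract_numbers("the task finished at the year 2018")
print(numbers)  # ['2018']
```

If only four-digit years are wanted, the pattern could be tightened to something like `\b\d{4}\b`, but that is a design choice that depends on the data.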
  • kayman Member Posts: 662 Unicorn
    You could also have done it with a single load by using the Multiply operator. Use one output port to filter for number-style strings and the other to do the opposite.

    The outcome is the same, of course, but the data is loaded only once.
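
The "load once, filter twice" idea can be sketched in plain Python as well. This is only an analogy to the Multiply operator, not RapidMiner code: the text is tokenized once, and the same token list is then passed through two opposite filters. The function name `split_tokens` and the token pattern are assumptions chosen for the example sentence from earlier in the thread.

```python
import re

def split_tokens(text):
    """Tokenize text once, then filter the same tokens two ways.

    Assumption: a token is either a run of ASCII letters or a run of digits;
    punctuation and whitespace are discarded.
    """
    tokens = re.findall(r"[A-Za-z]+|\d+", text)    # single pass over the input
    words = [t for t in tokens if t.isalpha()]     # first branch: words only
    numbers = [t for t in tokens if t.isdigit()]   # second branch: numbers only
    return words, numbers

sentence = ("Bobbie goes to school today in the morning at 8 oclock "
            "with his 30 packs of noodles.")
words, numbers = split_tokens(sentence)
print(numbers)     # ['8', '30']
print(words[:3])   # ['Bobbie', 'goes', 'to']
```

Both result lists come from the same tokenized input, which mirrors feeding one data source into two filter branches.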
  • Ahmedte1234 Member Posts: 3 Contributor I
    How can I post a question in this forum? I need help very much.
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @Ahmedte1234

    Please see the screenshots below. There is a big "Ask Question" icon at the top right of this community window. If you click it, you can read some quick tips on posting a question. You need to provide a title for the question and give a detailed description of your process and issue.

    Once you click it, you get the screen below. Read the three steps shown there and provide a detailed explanation of your issue.


    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing
