
How do I load the data stored by a store operator outside RapidMiner?

naveen_bharadwa Member Posts: 9 Contributor I
edited November 2018 in Help

Hey, 

I have a stored object that's around 10GB in size. When I try to process this object, RapidMiner stops and reports that there isn't enough memory to continue the process. Is there a way I can load the data into a Python/Java object so that I can perform the operations I want on it?

 

Regards,

Naveen

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    What type of object is it? There have been a few related threads on this topic recently. Assuming this is a dataset of some kind, and you don't actually need it all in memory at the same time, your best option would be to split it into separate smaller sets (CSV files, database tables) and then use one of the Loop operators to cycle through them. But if you actually need everything in memory at the same time (e.g., for model training), then you are probably going to have to set up RapidMiner Server or use RapidMiner Cloud to take advantage of a machine with more available RAM.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
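
The "split it into smaller sets" idea can also be done outside RapidMiner. The following is a minimal sketch, assuming the data can be exported as a plain CSV file with a header row at some point in the process (for example with the Write CSV operator); the thread does not confirm that this is possible for the 10GB object. The function name `split_csv`, the `part_00000.csv` naming, and the default of 100,000 rows per part are all hypothetical choices for illustration. The file is read one row at a time, so memory use stays small regardless of the total size.

```python
import csv
from pathlib import Path


def split_csv(source, out_dir, rows_per_part=100_000):
    """Split a large CSV into numbered part files, repeating the header in each.

    Returns the list of part file paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with open(source, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return parts  # empty input: nothing to split
        out = None
        writer = None
        try:
            for i, row in enumerate(reader):
                # Start a new part file every rows_per_part data rows.
                if i % rows_per_part == 0:
                    if out is not None:
                        out.close()
                    path = out_dir / f"part_{len(parts):05d}.csv"
                    out = open(path, "w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(path)
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()
    return parts
```

Each resulting part can then be read back by a Loop operator in RapidMiner, or processed one file at a time in Python.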
  • naveen_bharadwa Member Posts: 9 Contributor I

    This is actually the result of the Apply Model operator. This is the data I need to make inferences on. I've tried splitting the stored object, but it fails even before the entire object is loaded into memory. So I was wondering if there is a way to load the locally stored object and parse it to get the data out in some form, removing the concepts that I don't want.

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    I don't think there is a way to do this splitting during the Apply Model step. But why not split the data before you apply the model? Then you can remove the extra attributes that are no longer needed after scoring and join/append all the results back together again.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @naveen_bharadwa I think what @Telcontar120 is alluding to is being thrifty with your data analysis. A 10GB data object sounds like it's filled with a lot of unnecessary 'stuff.' Have you checked your process to see where you can cut unnecessary data? If you really can't, then you're starting to move into the Hadoop/Radoop area with analysis, so you might consider going that route.
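
Cutting unneeded attributes and appending the pieces back together, as suggested in the replies above, could look roughly like this in Python. It is only a sketch under assumptions: the scored data is taken to be available as one or more CSV files with header rows, and the function name `keep_columns` and the column names in the usage note are hypothetical. Rows are streamed one at a time, so only the columns you keep ever reach the output file and memory use stays low.

```python
import csv


def keep_columns(part_paths, out_path, wanted):
    """Stream each CSV file row by row and append only the wanted columns to out_path.

    Returns the number of data rows written.
    """
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        # extrasaction="ignore" silently drops every column not listed in wanted.
        writer = csv.DictWriter(out, fieldnames=wanted, extrasaction="ignore")
        writer.writeheader()
        for path in part_paths:
            with open(path, newline="", encoding="utf-8") as src:
                for row in csv.DictReader(src):
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as `keep_columns(["scored_1.csv", "scored_2.csv"], "slim.csv", ["id", "prediction"])` would produce a single file holding just those two columns; the file and column names here are placeholders.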
