Meta Data Calculation

cherokee Member Posts: 82 Maven
edited November 2018 in Help
Hi!

Is there a way to turn off the meta data calculation or handling? I can't tell where exactly my problem lies.

First of all, let me state that I really appreciate this feature. But I recently wanted to try something that involves a huge amount of data (3,500 examples, 60,000 attributes). As I don't have the data yet, I wanted to use an ExampleSetGenerator for this. It is simply impossible! When I add it to the process and adjust the parameters accordingly, RM nearly stops working. Every parameter change, and even some clicks, takes about 5 minutes to complete.

This is especially funny as I noticed that my PC has too little RAM to process this setup (example generation only)! So what is the meta data generation working on?

[Edit:] Even 10,000 attributes (which my PC can handle) take the same time!

Best regards,
Michael

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Michael,
    this problem arose from the way the example set generators built the meta data: they simply constructed the data and then extracted the meta data from it. This becomes very time-consuming when the number of examples or attributes is high. We have already changed this behavior in the recent builds, and the change will be included in the upcoming beta version.
    To get around this problem, you can generate the example set once and store it in the repository. Then you can use the repository source to load the data again. This avoids the generation step, and since the repository saves all meta data, you don't lose it.

    Greetings,
      Sebastian
  • cherokee Member Posts: 82 Maven
    Thank you very much for your quick response.

    Best regards,
    Michael
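
For readers who want to see the idea behind Sebastian's answer, here is a minimal sketch in plain Python. It is not RapidMiner code and does not use any RapidMiner API: every function name, the JSON cache file, and the reduction of meta data to two counts are assumptions made purely for illustration. It contrasts extracting meta data from fully generated data with deriving it from the generator parameters, and shows the "generate once, store, reload" workaround.

```python
import json
import random
import tempfile
from pathlib import Path


def generate_example_set(n_examples, n_attributes, seed=0):
    """Expensive step: materialize every value of the example set."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_attributes)] for _ in range(n_examples)]


def metadata_from_data(rows):
    """Slow route: the data must already exist before meta data can be read off it."""
    return {"examples": len(rows), "attributes": len(rows[0]) if rows else 0}


def metadata_from_parameters(n_examples, n_attributes):
    """Cheap route: the generator parameters already determine this meta data."""
    return {"examples": n_examples, "attributes": n_attributes}


def load_or_generate(path, n_examples, n_attributes):
    """Workaround: generate once, store data plus meta data, reload afterwards."""
    path = Path(path)
    if path.exists():
        with path.open() as fh:
            stored = json.load(fh)
        return stored["rows"], stored["meta"]
    rows = generate_example_set(n_examples, n_attributes)
    meta = metadata_from_data(rows)
    with path.open("w") as fh:
        json.dump({"rows": rows, "meta": meta}, fh)
    return rows, meta


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "example_set.json"  # hypothetical stand-in for a repository entry
        rows, meta = load_or_generate(cache, 35, 600)    # first call generates and stores
        rows2, meta2 = load_or_generate(cache, 35, 600)  # second call only reloads
        print(meta == meta2 == metadata_from_parameters(35, 600))
```

In this sketch the second call never touches the generator, which mirrors loading a stored example set from the repository instead of regenerating it on every parameter change.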