Generating Synthetic Data or Simulated Data

omar_a_karim Member Posts: 2 Contributor I
edited December 2018 in Help

I am new to RapidMiner but not new to data science. Synthetic data has its uses in developing data science solutions, and I am looking for the best RapidMiner approach to simulating booking events, such as airline bookings. As an example, consider a single flight: each day, a certain number of passengers book or cancel. If the flight leaves on, say, 3/1/2019, bookings could start coming in about 60 days prior, say 1/1/2019, and continue through the days leading up to the flight. So I have 60 booking days and one flight. In principle this is easy to simulate, even in Excel.

Imagine now that I have a hundred flights and a 60-day booking window. With a page of Python/pandas I can quickly create this synthetic data, with different booking characteristics for each flight depending on flight date, origin, and destination, among other factors.
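For illustration, a script of that kind might look like the minimal sketch below. It is not the script referred to in this post: the airport codes, the seat capacity, the `base_rate` field, the linear demand ramp, and the 5% cancellation cap are all assumptions made up for the example, and it uses only the standard library so it runs anywhere.

```python
import random
from datetime import date, timedelta


def simulate_bookings(flights, window_days=60, seed=0):
    """Return one dict per (flight, booking day) with simulated bookings and cancellations."""
    rng = random.Random(seed)  # seeded so the synthetic data is reproducible
    rows = []
    for flight in flights:
        booked = 0
        for days_out in range(window_days, 0, -1):
            # Assumption: daily demand ramps up linearly from 1x to about 2x the base rate.
            ramp = 1 + (window_days - days_out) / window_days
            seats_left = flight["capacity"] - booked
            new = min(seats_left, rng.randint(0, round(flight["base_rate"] * ramp)))
            # Assumption: at most 5% of current bookings cancel on any one day.
            cancelled = rng.randint(0, booked // 20)
            booked += new - cancelled
            rows.append({
                "flight_id": flight["flight_id"],
                "origin": flight["origin"],
                "destination": flight["destination"],
                "flight_date": flight["flight_date"],
                "booking_date": flight["flight_date"] - timedelta(days=days_out),
                "new_bookings": new,
                "cancellations": cancelled,
                "total_booked": booked,
            })
    return rows


# Hypothetical flights: 100 of them, with characteristics that vary by route and date.
airports = ["JFK", "LAX", "ORD", "DFW"]
flights = [
    {
        "flight_id": i,
        "origin": airports[i % 4],
        "destination": airports[(i + 1) % 4],
        "flight_date": date(2019, 3, 1) + timedelta(days=i % 30),
        "capacity": 150,
        "base_rate": 2 + i % 4,  # busier routes get a higher daily booking rate
    }
    for i in range(100)
]

rows = simulate_bookings(flights)
```

Passing `rows` to `pandas.DataFrame(rows)` would turn the result into a table with one row per flight and booking day.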

 

Conceptually, how should I get started with this in RapidMiner Studio? I have rummaged through the nodes named "Generate", but I did not see an obvious and simple way to go about it. I am sure I must have missed something. This is where RapidMiner experts like you, dear Reader, can be very helpful. I am looking for some guidance, not a full solution. Many thanks.

Best Answer

  • SGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn
    Solution Accepted

    Hi Omar,

    There is a very simple way that's right at hand: you can reuse the Python scripts that you have already been using. You just need to install the Python scripting extension.

    I avoid repeating myself or reinventing the wheel as much as I can, so I think that in your case this is also the "expert" solution.

    Regards,

    Sebastian
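As a rough sketch of what that could look like: the extension's Execute Python operator runs a script that defines a function named `rm_main` and passes the returned pandas DataFrame to the operator's output port. The example below assumes no input ports are connected, and the flight date, the 60-day window, and the Poisson booking rates are placeholder assumptions, not values from this thread.

```python
import numpy as np
import pandas as pd


def rm_main():
    """Entry point called by the Execute Python operator; returns one DataFrame."""
    rng = np.random.default_rng(42)  # fixed seed so reruns give the same data
    flight_date = pd.Timestamp("2019-03-01")
    window_days = 60
    days_out = np.arange(window_days, 0, -1)  # 60, 59, ..., 1 days before departure

    # Assumption: daily bookings are Poisson with a mean that rises toward departure.
    mean_bookings = 1.0 + 3.0 * (window_days - days_out) / window_days

    return pd.DataFrame({
        "flight_date": flight_date,
        "booking_date": flight_date - pd.to_timedelta(days_out, unit="D"),
        "days_out": days_out,
        "bookings": rng.poisson(mean_bookings),
    })
```

Calling `rm_main()` in a plain Python session returns the same DataFrame, so the script can be developed and checked outside RapidMiner before it is pasted into the operator.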

Answers

  • omar_a_karim Member Posts: 2 Contributor I

    Thanks, Sebastian. Of course that makes a lot of sense: using the encapsulated Python. Yes, I am able to do this. I will explore the capabilities of the scripting extensions some more as well.
