Generating Synthetic or Simulated Data
I am new to RapidMiner but not new to data science. Synthetic data has its uses in developing data science solutions, and I am looking for the best RapidMiner approach to simulating booking events, such as airline bookings. As an example, consider a single flight: each day, a certain number of passengers book or cancel. If the flight leaves on, say, 3/1/2019, bookings could start coming in about 60 days prior, say on 1/1/2019, and continue through the days leading up to the flight. So I have 60 booking days and one flight. In principle this is easy to simulate, even in Excel.
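To make the single-flight case concrete, here is a minimal Python sketch of the kind of simulation I have in mind. It uses only the standard library. The function name, the column names, the booking rate, and the cancellation probability are all hypothetical choices of mine for illustration, and I am reading 3/1/2019 as March 1, so the 60 booking days run from January 1 through the departure date.

```python
import random
from datetime import date, timedelta


def simulate_single_flight(departure, booking_days=60, mean_bookings=3.0,
                           cancel_prob=0.05, seed=0):
    """Return one dict per booking day for a single flight (illustrative only)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rows = []
    booked = 0
    # days_out counts down from booking_days - 1 to 0 (the departure date itself).
    for days_out in range(booking_days - 1, -1, -1):
        booking_date = departure - timedelta(days=days_out)
        # Assumption: new bookings are a crude uniform integer draw around the mean.
        new_bookings = rng.randint(0, int(2 * mean_bookings))
        # Assumption: each passenger already booked cancels independently.
        cancellations = sum(rng.random() < cancel_prob for _ in range(booked))
        booked += new_bookings - cancellations
        rows.append({
            "booking_date": booking_date,
            "days_out": days_out,
            "bookings": new_bookings,
            "cancellations": cancellations,
            "total_booked": booked,
        })
    return rows


rows = simulate_single_flight(date(2019, 3, 1))
```

This gives one record per booking day, with a running total of passengers on the flight.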
Now imagine that I have a hundred flights, each with a 60-day booking window. With a page of Python/Pandas I can quickly create this synthetic data, giving each flight different booking characteristics depending on flight date, origin, and destination, among other factors.
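For reference, the Python/Pandas version I am describing looks roughly like the sketch below. Everything specific in it is an assumption made for illustration: the airport codes, the departure date range, the Poisson booking rates that ramp up toward departure, and the simplification that cancellations are drawn as a fraction of the same day's bookings. It is not meant as a realistic demand model.

```python
import numpy as np
import pandas as pd


def simulate_flights(n_flights=100, booking_days=60, seed=0):
    """Build a long table with one row per flight per booking day (illustrative)."""
    rng = np.random.default_rng(seed)
    airports = ["ATL", "ORD", "DFW", "DEN", "LAX"]  # hypothetical airport list
    first_departure = pd.Timestamp("2019-03-01")
    frames = []
    for flight_id in range(n_flights):
        # Pick two distinct airports so origin never equals destination.
        origin, destination = rng.choice(airports, size=2, replace=False)
        # Assumption: departures are spread over a 30-day period.
        departure = first_departure + pd.Timedelta(days=int(rng.integers(0, 30)))
        # Each flight gets its own base booking rate.
        base_rate = rng.uniform(1.0, 5.0)
        days_out = np.arange(booking_days - 1, -1, -1)
        # Assumption: demand ramps up linearly as departure approaches.
        rate = base_rate * (1.0 + (booking_days - days_out) / booking_days)
        bookings = rng.poisson(rate)
        # Simplification: cancellations are a binomial share of that day's bookings.
        cancellations = rng.binomial(bookings, 0.05)
        frames.append(pd.DataFrame({
            "flight_id": flight_id,
            "origin": origin,
            "destination": destination,
            "departure_date": departure,
            "booking_date": departure - pd.to_timedelta(days_out, unit="D"),
            "days_out": days_out,
            "bookings": bookings,
            "cancellations": cancellations,
        }))
    return pd.concat(frames, ignore_index=True)


df = simulate_flights()
```

With the defaults this produces 100 flights times 60 booking days, or 6,000 rows, which is the shape of data I would like to generate inside RapidMiner.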
How should I conceptually get started with this in RapidMiner Studio? I have rummaged through the operators named "Generate", but I did not see an obvious and simple way to go about this, so I must have missed something. This is where RapidMiner experts like you, dear Reader, can be very helpful. I am looking for some guidance, not a full solution. Many thanks.