How can the Generate Data operator be configured to create right or left skewed distributions?

michaelgloven RapidMiner Certified Analyst, Member Posts: 46 Guru
How can the Generate Data operator be configured to create right- or left-skewed distributions? I see the normal distribution and would like to adjust it for skew, as well as trim negative tails.

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist

    I think you cannot. What kind of distribution would you need?

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • michaelgloven RapidMiner Certified Analyst, Member Posts: 46 Guru
    Hi Martin, we've created a Monte Carlo process in RM and have been able to successfully apply a single Gaussian cluster to the input data for our function. However, some of the input data is by its nature right- or left-skewed and/or must be non-negative. The practical use case is this: when we calculate the maximum burst pressure of steel pipe using a specified pipe property from the mill (let's say 60,000 psi), it is very unlikely that the specified property is an average. The 60,000 psi more often sits maybe 2-3 standard deviations to the left of the mean, and the actual average is probably 68,000 psi, with a small tail to the right capped at some maximum like 70,000 psi. So we're trying to figure out how to shape a normal distribution based on an understanding of what our tails look like. I found ways to do this on Stack Exchange, but the math requires calculus (which I don't know how to do easily in RapidMiner), and we're not trying to get an exact solution, just something that better represents reality.
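
One calculus-free way to approximate the shape described above is to draw from a skew-normal distribution and reject any draw outside the allowed range. The sketch below is only an illustration and is not a feature of the Generate Data operator. It assumes a Python step is available to the process (for example through RapidMiner's Python Scripting extension, if installed). The function name `sample_truncated_skew_normal` and all numeric values are hypothetical placeholders, not calibrated pipe data.

```python
import math
import random
import statistics


def sample_truncated_skew_normal(n, loc, scale, alpha,
                                 low=-math.inf, high=math.inf, seed=None):
    """Draw n skew-normal values, rejecting any that fall outside [low, high].

    alpha < 0 gives a longer left tail, alpha > 0 a longer right tail,
    and alpha == 0 reduces to an ordinary normal distribution.
    Note: loc and scale are NOT the mean and standard deviation once
    alpha != 0 or truncation is applied; tune them by inspecting the output.
    """
    rng = random.Random(seed)
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    root = math.sqrt(1.0 - delta * delta)
    samples = []
    tries = 0
    max_tries = 1000 * n  # guard against bounds that reject almost everything
    while len(samples) < n:
        tries += 1
        if tries > max_tries:
            raise ValueError("bounds reject nearly all draws; widen low/high")
        # Stochastic representation of the skew-normal: combine a
        # half-normal draw with an independent standard normal draw.
        z0 = abs(rng.gauss(0.0, 1.0))
        z1 = rng.gauss(0.0, 1.0)
        x = loc + scale * (delta * z0 + root * z1)
        if low <= x <= high:  # simple rejection step trims both tails
            samples.append(x)
    return samples


if __name__ == "__main__":
    # Placeholder parameters: long left tail, hard cap at 70,000 psi,
    # and no negative values.
    strength_psi = sample_truncated_skew_normal(
        n=10_000, loc=70_000.0, scale=3_000.0, alpha=-4.0,
        low=0.0, high=70_000.0, seed=42,
    )
    print("mean  :", round(statistics.mean(strength_psi)))
    print("median:", round(statistics.median(strength_psi)))
    print("min   :", round(min(strength_psi)))
    print("max   :", round(max(strength_psi)))
```

Rejection sampling was chosen here because it needs no integration: the cap and the non-negativity constraint are enforced by discarding out-of-range draws. The trade-off is that `loc`, `scale`, and `alpha` have to be adjusted by trial until the sample mean and tails look like the real data.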