# Comparing the profitability of different rules?

Correlation, Member, Posts: 7, Contributor II

Hi Everybody,

First post but I've done numerous searches and have tried to sort this problem out before asking.

I hope one of you kind people can help me.

I have market data that's separated into various columns: age, ethnicity, postcode, etc.

I have a column to show if a purchase was made (1, 0).

I also have a column to show the profit on each transaction.

So far I have managed to find a good model that predicts who is most likely to purchase an item.

What I would like to do is take this one step further and find the best fit for profitability.

Currently I could have a model that finds 60% of the people who purchase, but they generally buy low-profit items.

I would rather have a model that finds 40% of the people who purchase, if their items carry twice the profit of the above example.

Is there an easy way to check all the combinations to produce the most profitable model?

Thanks in advance and sorry if this has been asked elsewhere before, I'm sure it's a popular request but I couldn't find anything in my searches.


## Answers

Posts: 2,531, Unicorn

If I understood you correctly, you have some threshold above which a profit counts as high and below which it does not, so you are presumably using some parameter value to distinguish high from low. You could optimize this parameter value using one of the optimization operators and use a Performance (Costs) operator to calculate the real costs and benefits of each outcome. Was this helpful?

Another possibility might be to use the Threshold finder.

Greetings,

Sebastian
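Outside RapidMiner, the idea of tuning a cutoff against real costs and benefits can be sketched in a few lines of Python. This is only an illustration, not the behaviour of any RapidMiner operator: the confidence scores, the profits, and `COST_PER_CALL` are all invented, and `net_profit` and `best_threshold` are hypothetical helper names.

```python
# Hypothetical data: (model confidence that a sale happens, profit if contacted).
# Every number here is invented for illustration.
records = [
    (0.90, 15), (0.20, 0), (0.60, 0), (0.70, 5),
    (0.65, 5), (0.10, 0), (0.80, 10), (0.55, 0),
    (0.30, 0), (0.85, 20), (0.15, 0), (0.40, 0),
]
COST_PER_CALL = 2  # assumed cost of contacting one person


def net_profit(threshold, data, cost_per_call):
    """Total profit minus call costs when everyone at or above the threshold is contacted."""
    contacted = [profit for confidence, profit in data if confidence >= threshold]
    return sum(contacted) - cost_per_call * len(contacted)


def best_threshold(data, cost_per_call):
    """Try every observed confidence as a cutoff and keep the most profitable one."""
    candidates = sorted({confidence for confidence, _ in data})
    return max(candidates, key=lambda t: net_profit(t, data, cost_per_call))


threshold = best_threshold(records, COST_PER_CALL)
print(threshold, net_profit(threshold, records, COST_PER_CALL))
```

The point of the sketch is that the cutoff is judged by money earned rather than by accuracy, so a threshold that misses some cheap sales can still come out ahead.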

Posts: 7, Contributor II

Thanks for replying. I was concerned that I hadn't worded my question very well.

I may be misunderstanding your advice, but I'll try a better explanation.

I've added an example for visualisation.

Imagine two columns of data in Excel. The first column is a Sale column filled with just a 1 or a 0.

1 signifies a sale, 0 means no sale.

The next column called Profit shows the profit made on that sale. ;D

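| Row # | Sale | Profit |
|-------|------|--------|
| 1     | 1    | 15     |
| 2     | 0    | 0      |
| 3     | 0    | 0      |
| 4     | 1    | 5      |
| 5     | 1    | 5      |
| 6     | 0    | 0      |
| 7     | 1    | 10     |
| 8     | 0    | 0      |
| 9     | 0    | 0      |
| 10    | 1    | 20     |
| 11    | 0    | 0      |
| 12    | 0    | 0      |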

Using historical data as explained in my first post (which includes the Sale column from the example as a label), I currently use RapidMiner to fit a sale model that predicts whether an application is likely to be a 1 or a 0. The model performs well, but I now wish to take this one step further.

I would like to somehow add the profit data to the equation (it isn't included at the moment) so that the model will predict the best return of sales and profit. For example, the model at the moment (without a profit column) could pick rows 1, 3, 4, 5, 7, 8, which would return 4 sales from 6 calls and a profit of 35 points (which it doesn't know).

I would like the model to check all the sums and variables, so that the best output could be rows 1, 2, 7, 8, 10, 12, which is 3 sales from 6 calls but with a profit of 45 points.
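The two selections above can be checked with a short Python sketch. The sale and profit values are copied from the example table; the helper name `summarize` and the variable names are only for illustration.

```python
# Sale flag and profit per row, copied from the example table (rows 1 to 12).
rows = {
    1: (1, 15), 2: (0, 0), 3: (0, 0), 4: (1, 5),
    5: (1, 5), 6: (0, 0), 7: (1, 10), 8: (0, 0),
    9: (0, 0), 10: (1, 20), 11: (0, 0), 12: (0, 0),
}


def summarize(selected):
    """Return (calls, sales, total profit) for a list of selected row numbers."""
    sales = sum(rows[r][0] for r in selected)
    profit = sum(rows[r][1] for r in selected)
    return len(selected), sales, profit


current_pick = [1, 3, 4, 5, 7, 8]       # what the sale-only model might choose
preferred_pick = [1, 2, 7, 8, 10, 12]   # fewer sales, more profit

print(summarize(current_pick))    # (6, 4, 35)
print(summarize(preferred_pick))  # (6, 3, 45)
```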

I imagine I will need some sort of model that uses two labels, and maybe a "sum product" equation to check against all combinations?

I did try using a sum product feature but I couldn't get it to work. Please could somebody post an XML example of any ideas they may have?

I really hope this makes sense, Thanks again in advance.

Posts: 2,531, Unicorn

Might it be a solution to train two independent models? The first one predicts sale or no sale (0, 1), and the second one predicts the profit, possibly trained only on examples representing a sale. You can do this by switching roles using the Set Role operator.

After you have applied the two models, you might use the Generate Attributes operator to multiply the predicted sale column (0 or 1) by the predicted profit, so that you get a valid prediction only for the examples where a sale is expected.

Greetings,

Sebastian
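The final multiplication step of the two-model idea might look like this in Python. It is a sketch under stated assumptions, not a RapidMiner process: the applicants and both prediction columns are invented, and `expected_profit` and `rank_by_expected_profit` are hypothetical helper names standing in for what the attribute-generation step would compute.

```python
# Hypothetical outputs of the two models for six new applicants.
# predicted_sale comes from the sale / no-sale model (0 or 1);
# predicted_profit comes from the profit model trained on sales only.
applicants = [
    {"id": "A", "predicted_sale": 1, "predicted_profit": 6.0},
    {"id": "B", "predicted_sale": 0, "predicted_profit": 18.0},
    {"id": "C", "predicted_sale": 1, "predicted_profit": 14.0},
    {"id": "D", "predicted_sale": 1, "predicted_profit": 4.5},
    {"id": "E", "predicted_sale": 0, "predicted_profit": 9.0},
    {"id": "F", "predicted_sale": 1, "predicted_profit": 19.0},
]


def expected_profit(applicant):
    """Predicted profit counts only where a sale is predicted."""
    return applicant["predicted_sale"] * applicant["predicted_profit"]


def rank_by_expected_profit(people):
    """Sort applicants from most to least profitable according to both models."""
    return sorted(people, key=expected_profit, reverse=True)


ranked = rank_by_expected_profit(applicants)
top3 = [a["id"] for a in ranked[:3]]
print(top3)
```

If the first model can output a confidence between 0 and 1 instead of a hard 0 or 1, multiplying by that confidence gives a smoother ranking, but that goes beyond what the reply above describes.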

Posts: 7, Contributor II

Sorry for the delayed reply. I had hoped it would be possible to do this in one go.

I'll have a look at the Generate Attributes operator as suggested.
