Prediction for next orders, any ideas?

EJECTY · April 2017

Dear Community!

I have a .csv file with 100,000 rows and 439 columns. The spreadsheet represents customers' habits in using a specific service. Each row holds a customer ID followed by every transaction date for that customer, in the following format: 1 for Monday, 2 for Tuesday, etc. I need to predict the next transaction date for every customer from these past records.

Here's an example of the data format:

customer_id transaction1 transaction2 ... transaction438
1 1 2 3 4 5 6 7 ... 745 746 747
2 2 7 16 20 21 23 28 ... 412
3 1 2 3 4 5 6 7 ... 285 322
4 5 7 8 12 14 19 21 ... 924 925 926

Any ideas on which model I should use to get the best prediction accuracy?

NOTE: The data has many missing values, depending on how often each customer orders.
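
For anyone who wants to look at this layout outside RapidMiner, here is a minimal Python sketch of reading such a wide file into one list of transaction days per customer, dropping the empty cells. The comma delimiter, the sample values, and the function name `load_transactions` are assumptions for illustration; the original file may differ.

```python
import csv
import io

# Hypothetical sample in the wide layout described above.
# Empty cells stand for the missing values (customers with fewer orders).
SAMPLE = """customer_id,transaction1,transaction2,transaction3,transaction4
1,1,2,3,4
2,2,7,16,
3,5,7,,
"""

def load_transactions(handle):
    """Return {customer_id: sorted list of day numbers}, skipping empty cells."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    result = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        customer_id = int(row[0])
        days = [int(cell) for cell in row[1:] if cell.strip()]
        result[customer_id] = sorted(days)
    return result

transactions = load_transactions(io.StringIO(SAMPLE))
print(transactions)
```

Keeping a variable-length list per customer avoids having to invent replacement values for the empty cells at this stage.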

Thomas_Ott · April 2017

This looks like some sort of sales projection analysis. I would look at the process I shared here: http://community.rapidminer.com/t5/RapidMiner-Studio/How-to-get-forecast-values-of-future-from-time-series-data/m-p/37698

You would need to replace some missing values with the Replace Missing Values operator and install the Series extension from our marketplace. Is there seasonality involved?

EJECTY · April 2017

It is a homework assignment at university; we are learning the basics of RapidMiner. We did similar exercises earlier, but those had a label column in the training data. This time I have no clue how to predict the outcome without that column. I thought about some sort of pattern analysis, or converting the data to a range from 1 to 7 to simplify the problem, but I couldn't get to a real solution.

I think seasonality doesn't matter, because it's just an example.
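
One possible way around the missing label column, sketched in Python rather than as a RapidMiner process: hold out each customer's last known transaction as the label and build features from the transactions before it. The sketch also shows the 1 to 7 conversion mentioned above. It assumes day 1 falls on a Monday and that the numbers are running day counts; the function names, the feature choice, and `n_lags` are hypothetical.

```python
def to_weekday(day_number):
    """Map a running day number to 1..7, assuming day 1 is a Monday."""
    return (day_number - 1) % 7 + 1

def make_training_row(days, n_lags=3):
    """Use the last transaction as the label and the gaps before it as features.

    Returns None when the customer has too few transactions.
    """
    if len(days) < n_lags + 2:
        return None
    history, label_day = days[:-1], days[-1]
    # Day differences between consecutive past transactions.
    gaps = [b - a for a, b in zip(history[:-1], history[1:])]
    return {
        "last_gaps": gaps[-n_lags:],              # most recent gaps as features
        "last_weekday": to_weekday(history[-1]),  # weekday of the latest known order
        "label_gap": label_day - history[-1],     # days until the held-out order
    }

# Customer 2 from the example table (only the values shown there).
row = make_training_row([2, 7, 16, 20, 21, 23, 28])
print(row)
```

A table of such rows has a label column again, so the usual supervised operators could be trained on it.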

Thomas_Ott · April 2017

If it's sales, you could sum up the values and compute Total Sales per month or week. You can use the dates as your ID and the Total Sales as your Label.

EJECTY · April 2017

Because the database contains the transaction days in coded form, not quantities, computing totals is not possible and would not make sense.
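
Since the values are days rather than quantities, a simple baseline could work on the gaps between consecutive orders instead of on totals. The Python sketch below predicts the next day as the last known day plus the customer's typical gap. This is only an illustrative baseline under the assumption that the numbers are running day counts, not something suggested in this thread; `predict_next_day` is a hypothetical name.

```python
from statistics import median_low

def predict_next_day(days):
    """Predict the next transaction day as last day + typical gap.

    Returns None when fewer than two transactions are known.
    """
    if len(days) < 2:
        return None
    gaps = [b - a for a, b in zip(days, days[1:])]
    # median_low always returns an observed gap, so the result stays an integer.
    return days[-1] + median_low(gaps)

print(predict_next_day([1, 2, 3, 4]))               # a customer ordering daily
print(predict_next_day([2, 7, 16, 20, 21, 23, 28]))  # an irregular customer
```

A baseline like this gives a reference accuracy that any more elaborate model should beat.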

Thomas_Ott · April 2017

AH! Did you try the Generalized Sequential Patterns operator?
