Label new data given historical data distributions

asav_yu · September 2019

I have a list of products to be sold this year, and I run a model to price them. I know everything about them, including the location where they will be sold and their condition (1-10).

I want to forecast the average sale price for next year. I have a list of products for next year, but I do not know their location or condition. How can I add location and condition info, assuming the same distribution next year as this year? Product condition follows a normal distribution, and for location I have info like "20% are sold at this location".
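For illustration, the setup described above can be sketched in plain Python: draw each new product's location in proportion to this year's shares, and draw its condition from a normal distribution fitted to this year's conditions. The function name `impute_attributes`, the dictionary keys, and the sample data are all hypothetical, and rounding and clipping the condition to the 1-10 scale is an assumption rather than something stated in the question.

```python
import random
from collections import Counter
from statistics import mean, stdev


def impute_attributes(history, n_new, seed=0):
    """Sample location and condition for n_new products from historical rows.

    history: list of dicts with keys "location" and "condition" (hypothetical schema).
    """
    rng = random.Random(seed)

    # Empirical location shares: each location is weighted by its historical count.
    counts = Counter(row["location"] for row in history)
    locations = list(counts)
    weights = [counts[loc] for loc in locations]

    # Normal distribution fitted to this year's condition scores.
    conditions = [row["condition"] for row in history]
    mu, sigma = mean(conditions), stdev(conditions)

    sampled = []
    for _ in range(n_new):
        loc = rng.choices(locations, weights=weights, k=1)[0]
        # Assumption: round to an integer score and clip to the 1-10 scale.
        cond = min(10, max(1, round(rng.gauss(mu, sigma))))
        sampled.append({"location": loc, "condition": cond})
    return sampled


# Made-up historical data, for demonstration only.
history = [
    {"location": "A", "condition": 7},
    {"location": "A", "condition": 5},
    {"location": "B", "condition": 6},
    {"location": "C", "condition": 8},
    {"location": "A", "condition": 4},
]
new_rows = impute_attributes(history, n_new=1000)
```

Because the labels are random draws, repeating the sampling with several seeds and averaging the resulting price forecasts would give a more stable estimate than a single draw.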

I have ideas on how to do it in Excel, but I am wondering if there is a more scientific way to do it in RM. I'd appreciate your help!

yyhuang · September 2019

Hi @asav_yu,

Supposing you are talking about predicting the sale price of vintage/second-hand products, you can use similarity analysis to create the labels for the new data.

Your feature set includes sale location, condition, maker, description, etc. Using a similarity measure, you can find substitute products that lie a small distance from the target product. For each target product next year, you may get 3-5 nearby "neighbors" (by similarity) that were sold this year. Then take a weighted average of their prices as the estimate.
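As a rough sketch of this idea outside RapidMiner, the Python snippet below finds the k closest products sold this year and averages their prices with inverse-distance weights. It assumes the features are already numeric and comparably scaled, and that they are known for the new product. The function name `knn_price`, the Euclidean distance, the inverse-distance weighting, and the data are illustrative assumptions, not a prescribed method.

```python
import math


def knn_price(target, sold, k=3):
    """Estimate a price as the inverse-distance-weighted mean of the k nearest sold items.

    target: tuple of numeric features for the new product.
    sold: list of (features, price) pairs for products sold this year.
    """
    def distance(a, b):
        # Euclidean distance; assumes features are already scaled comparably.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    neighbors = sorted(sold, key=lambda item: distance(target, item[0]))[:k]
    # A small constant avoids division by zero for exact matches.
    weights = [1.0 / (distance(target, feats) + 1e-9) for feats, _ in neighbors]
    return sum(w * price for w, (_, price) in zip(weights, neighbors)) / sum(weights)


# Made-up data: (feature_1, feature_2) -> sale price.
sold = [
    ((7.0, 1.0), 100.0),
    ((6.0, 1.0), 90.0),
    ((2.0, 0.0), 30.0),
    ((8.0, 1.0), 120.0),
]
estimate = knn_price((6.5, 1.0), sold, k=3)
```

Closer neighbors get larger weights, so the estimate leans toward the prices of the most similar products.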

Cheers,

YY
