How to set the whole dataset (1000*100) as a label

TeeH Member Posts: 18 Contributor II
Let's say I have 4 datasets, each with 1000 rows and 100 columns. Each dataset holds a different variable (4 variables in total): 3 are predictors and 1 is the target. How do I set a whole 1000*100 dataset as the label, so that I can build a predictive model using the other 3 datasets as predictors? Treat the datasets together as one multidimensional dataset.

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi,

    What's your use case? Which problem are you trying to solve?

    Traditional machine learning models predict one number (regression) or class (classification) from the predictors. Are you trying to generate an entire dataset here? Including predictors and the label variable?

    Off the top of my head, I don't know of a way to do this in RapidMiner. It would take advanced hackery with generative neural networks or something similar.

    Regards,
    Balázs
  • TeeH Member Posts: 18 Contributor II
    I'm trying to predict vegetation change using climate variables. I'm using zonal statistics generated from a multidimensional dataset. It is a spatiotemporal dataset comprising the time dimension, the pixel ID, and the value of each variable.
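
One possible way to get a single label column out of such data is to flatten the grids into a long table with one row per cell. The sketch below assumes all grids share the same shape and that cell (r, c) refers to the same pixel and time step in each of them. The function name `to_long_table` and the column names are hypothetical.

```python
def to_long_table(predictors, target):
    """Flatten same-shaped grids into a list of rows, one per cell.

    predictors: dict mapping a variable name to a 2-D list (rows x columns)
    target:     2-D list with the same shape, used as the label
    """
    n_rows, n_cols = len(target), len(target[0])
    table = []
    for r in range(n_rows):
        for c in range(n_cols):
            row = {"row": r, "col": c}
            # One column per predictor variable (names are up to the caller).
            for name, grid in predictors.items():
                row[name] = grid[r][c]
            # The target grid becomes a single label column.
            row["label"] = target[r][c]
            table.append(row)
    return table
```

With three 1000*100 predictor grids and one target grid, this would produce 100,000 rows, each holding three predictor values and one label value.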
  • TeeH Member Posts: 18 Contributor II
    A follow-up on the first question: apart from prediction, is it possible to generate a new dataset from the other datasets, for example by carrying out a simple computation like addition?
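
Outside of RapidMiner, a cell-by-cell computation like this is straightforward when the datasets share the same shape. The following is a minimal Python sketch, not a RapidMiner process. The function name `combine_grids` is hypothetical, and the datasets are assumed to be plain 2-D lists.

```python
def combine_grids(a, b, op=lambda x, y: x + y):
    """Combine two same-shaped 2-D lists cell by cell (addition by default)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

Passing a different `op`, for example `lambda x, y: y - x`, would give a cell-by-cell difference instead of a sum.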
  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Oh, so your target variable is the vegetation change in every area one by one.

    Predicting each of the 1000x100 areas one by one will take time, but that's what one would do in this situation.

    I guess you have historical data for each area (selected by x, y coordinates, for example), such as the change 4 years ago, 3 years ago, etc. In this case you would build a model for each area, maybe taking the neighboring cells into consideration.

    I would start with a small part (not 100,000 models at once) or aggregate areas into larger ones to get a more robust prediction.

    Regards,
    Balázs
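
The two ideas above (aggregating areas into larger ones, and building one model per area from its history) can be sketched in plain Python. This is a simplified illustration, not a RapidMiner process. It assumes one grid per year, uses a single predictor with a straight-line fit instead of all three climate variables, and ignores neighboring cells. All function names are hypothetical.

```python
from statistics import mean

def aggregate_blocks(grid, block_rows, block_cols):
    """Average a 2-D list over non-overlapping blocks to get a coarser grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(0, n_rows, block_rows):
        out_row = []
        for c in range(0, n_cols, block_cols):
            cells = [grid[i][j]
                     for i in range(r, min(r + block_rows, n_rows))
                     for j in range(c, min(c + block_cols, n_cols))]
            out_row.append(mean(cells))
        out.append(out_row)
    return out

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # If the predictor never varies, fall back to a flat line at the mean.
    slope = sxy / sxx if sxx else 0.0
    return slope, y_bar - slope * x_bar

def fit_per_cell(predictor_by_year, target_by_year):
    """Fit one line per cell; returns {(row, col): (slope, intercept)}."""
    n_rows, n_cols = len(target_by_year[0]), len(target_by_year[0][0])
    models = {}
    for r in range(n_rows):
        for c in range(n_cols):
            xs = [grid[r][c] for grid in predictor_by_year]
            ys = [grid[r][c] for grid in target_by_year]
            models[(r, c)] = fit_line(xs, ys)
    return models
```

Running `aggregate_blocks` on each yearly grid first, and `fit_per_cell` on the coarser grids afterwards, reduces the number of models, which matches the suggestion to start small before fitting every cell.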
  • TeeH Member Posts: 18 Contributor II
    Yeah! I did aggregate, but I wanted to do prediction at the pixel level. Maybe I should try using image operators...