How can I handle missing values for only specific years so I can keep certain examples?
I am currently writing my master's thesis. I am trying to build a predictive model, but I am really stuck. I no longer know how to handle the missing values in the ExampleSet without removing valuable examples from my data. To give you a better idea of what my data looks like and what I mean, I have attached a small part of my dataset.
For example, row 122 is not useful in my opinion, as it only has data for 2017 and 2019. Row 226, on the other hand, is only missing 2019. So I thought I could delete rows such as 122, where insufficient data is available (only two years), but keep rows such as 226, where only one year (2019) is missing. That way I can keep the indicator G3. Is that possible?
Hence, I want to filter out any example that is missing at least X values between 2014 and 2019. But I do not know how to do this or which operator I need.
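To make the rule concrete, here is a rough sketch of the filter described above, written in plain Python rather than as a RapidMiner process. The column names `2014` to `2019`, the `indicator` column, the toy rows, and the threshold of 2 are all assumptions for illustration; the real dataset may be laid out differently.

```python
from typing import Any, Dict, List

# Assumed year columns; the real attribute names may differ.
YEAR_COLUMNS = ["2014", "2015", "2016", "2017", "2018", "2019"]

Row = Dict[str, Any]


def count_missing(row: Row, year_columns: List[str] = YEAR_COLUMNS) -> int:
    """Count how many of the year columns are missing (None) in this row."""
    return sum(1 for col in year_columns if row.get(col) is None)


def drop_sparse_rows(rows: List[Row], min_missing_to_drop: int,
                     year_columns: List[str] = YEAR_COLUMNS) -> List[Row]:
    """Keep only rows with fewer than `min_missing_to_drop` missing years."""
    return [
        row for row in rows
        if count_missing(row, year_columns) < min_missing_to_drop
    ]


# Toy rows (made-up values): one with data for 2017 and 2019 only,
# one that is missing only 2019.
sparse_row = {"indicator": "A", "2014": None, "2015": None, "2016": None,
              "2017": 1.0, "2018": None, "2019": 2.0}
nearly_complete_row = {"indicator": "G3", "2014": 1.0, "2015": 1.1,
                       "2016": 1.2, "2017": 1.3, "2018": 1.4, "2019": None}

# X = 2: drop any row that is missing at least two of the six years.
kept = drop_sparse_rows([sparse_row, nearly_complete_row],
                        min_missing_to_drop=2)
```

With X set to 2, the row that only has two years of data is dropped and the row that is missing a single year is kept, which is the behaviour I am looking for.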
Can anyone please help me out?
Thank you so much in advance.