FPGrowth algorithm with the existence of missing values

zazass8 Member Posts: 1 Learner I
I am trying to run the FP-Growth algorithm on a dataset that is already binarised but also contains some missing values. Instead of applying data imputation techniques, I believe it would be better to compute the support and confidence metrics while ignoring the missing values. For example, if item A occurs in 4 out of 10 transactions and its value is missing in 2 of those 10, then the support should be 4/8 instead of 4/10, and the same rule would apply to every itemset. I tried to edit the open-source code of the fpgrowth algorithm in the mlxtend library, but that turned out to be very hard, because the code is quite abstract. Has anyone found a way to solve this issue? I know @MattTC13 asked exactly the same question on this forum a few years ago. If you have a solution, it would be great if you could share it!
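To make the intended metric concrete, here is a minimal sketch in plain Python. It is not FP-Growth and not mlxtend code: it enumerates itemsets by brute force, which is only practical for a small number of items. The function names (`masked_support`, `masked_confidence`, `frequent_itemsets`) and the use of `None` for a missing cell are assumptions made for illustration. For itemsets with more than one item, the sketch also assumes that a transaction counts towards the denominator only if every item in the itemset is observed; the question only defines the single-item case, so this is one possible reading.

```python
from itertools import combinations

def masked_support(rows, itemset):
    """Support of `itemset` over rows where every item in it is observed."""
    observed = [r for r in rows if all(r[i] is not None for i in itemset)]
    if not observed:
        return None  # undefined: no row has all items observed
    hits = sum(all(r[i] == 1 for i in itemset) for r in observed)
    return hits / len(observed)

def masked_confidence(rows, antecedent, consequent):
    """Confidence of antecedent -> consequent, using only rows where
    all items on both sides are observed (an assumed convention)."""
    both = tuple(antecedent) + tuple(consequent)
    observed = [r for r in rows if all(r[i] is not None for i in both)]
    ante = [r for r in observed if all(r[i] == 1 for i in antecedent)]
    if not ante:
        return None  # undefined: antecedent never fully present
    return sum(all(r[i] == 1 for i in consequent) for r in ante) / len(ante)

def frequent_itemsets(rows, items, min_support, max_len=2):
    """Brute-force enumeration; no pruning is applied."""
    result = {}
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            s = masked_support(rows, itemset)
            if s is not None and s >= min_support:
                result[itemset] = s
    return result

# Toy data matching the example in the question:
# A is 1 in 4 rows, 0 in 4 rows, and missing (None) in 2 rows.
rows = [
    {"A": 1, "B": 1}, {"A": 1, "B": 1}, {"A": 1, "B": 0}, {"A": 1, "B": None},
    {"A": 0, "B": 1}, {"A": 0, "B": 0}, {"A": 0, "B": 0}, {"A": 0, "B": 1},
    {"A": None, "B": 1}, {"A": None, "B": 0},
]

print(masked_support(rows, ("A",)))  # 0.5, i.e. 4/8 rather than 4/10
print(masked_confidence(rows, ("A",), ("B",)))
print(frequent_itemsets(rows, ["A", "B"], min_support=0.4))
```

One caveat that may explain why patching an existing implementation is difficult: with a per-itemset denominator, support is no longer guaranteed to be non-increasing as items are added (an itemset can have higher support than one of its subsets if few rows have all of its items observed), so the pruning that FP-Growth and Apriori rely on does not carry over automatically. With a pandas DataFrame that uses NaN for missing cells, the observed-row mask in the sketch would correspond to `df[list(itemset)].notna().all(axis=1)`.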