FPGrowth algorithm with the existence of missing values

zazass8 · December 2022

I am trying to run the FP-Growth algorithm on a dataset that is already binarised but also contains some missing values. Instead of applying data imputation techniques, I believe it would be better to compute the support and confidence metrics while ignoring the missing values. For example, if item A occurs in 4 out of 10 transactions and its value is missing in 2 of them, then the support should be 4/8 instead of 4/10. The same rule would apply to every itemset. I tried to edit the open-source fpgrowth code in the mlxtend library, but that is very hard to do because the code is quite abstract. Has anyone found a way to solve this? I know @MattTC13 asked exactly the same question on this forum a few years ago. If you have a solution, it would be great if you could share it!
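
To make the idea concrete, here is a minimal sketch in plain Python of the support definition described above. It is not mlxtend code and not FP-Growth itself: it is a brute-force enumeration, and the function names `support_ignoring_missing` and `frequent_itemsets` are made up for illustration. It also assumes one particular extension to itemsets, namely that a transaction counts towards the denominator only when every item in the itemset is observed (not missing) in that transaction.

```python
from itertools import combinations


def support_ignoring_missing(rows, itemset):
    """Support of `itemset`, using only rows where all of its items are observed.

    `rows` is a list of dicts mapping item name -> 1, 0, or None (missing).
    Returns None when no row has every item of the itemset observed.
    ASSUMPTION: a row is dropped if ANY item in the itemset is missing.
    """
    observed = [r for r in rows if all(r.get(i) is not None for i in itemset)]
    if not observed:
        return None
    hits = sum(1 for r in observed if all(r[i] == 1 for i in itemset))
    return hits / len(observed)


def frequent_itemsets(rows, items, min_support, max_len=2):
    """Brute-force search over all itemsets up to `max_len` items (hypothetical helper)."""
    result = {}
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            s = support_ignoring_missing(rows, itemset)
            if s is not None and s >= min_support:
                result[itemset] = s
    return result


# The example from the post: A present in 4 rows, absent in 4, missing in 2.
rows = [{"A": 1}] * 4 + [{"A": 0}] * 4 + [{"A": None}] * 2
print(support_ignoring_missing(rows, ["A"]))  # 0.5, i.e. 4/8 rather than 4/10
```

One design note: because the denominator changes from itemset to itemset, this adjusted support is not guaranteed to shrink when items are added, so the pruning that FP-Growth and Apriori rely on may discard itemsets that would pass the threshold. That is why the sketch enumerates candidates directly, which is only practical for a small number of items.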
