GSP - wrong results?

Anna, Member, Posts: 5, Contributor I
edited November 2018 in Help

The max gap parameter of the GSP operator does not seem to work correctly.

 

I tried the GSP operator on a small data file:

IP Timestamp Page
170 10 a
170 20 c
170 30 e
170 40 f
17 10 a
17 15 c
17 20 f
116 10 a
116 30 c
116 50 e
116 70 d
185 10 a
185 20 c
185 30 f
185 40 e
185 50 b
185 60 e

 

When I run it with window=0, min gap=1, max gap=1000, sup=0.5 I get the right results:

 

GSPSet

1.000: <Page = a>  <Page = c>  
0.750: <Page = a> <Page = e>
0.750: <Page = a> <Page = f>
0.750: <Page = c> <Page = e>
0.750: <Page = c> <Page = f>
0.750: <Page = a> <Page = c> <Page = e>
0.750: <Page = a> <Page = c> <Page = f>

When I run it with window=0, min gap=1, max gap=10, sup=0.5, I get:

GSPSet

1.000: <Page = a>  <Page = c>  
0.750: <Page = a> <Page = e>
0.750: <Page = a> <Page = f>
0.750: <Page = c> <Page = e>
0.750: <Page = c> <Page = f>

and the right result should be:

GSPSet

0.750: <Page = a> <Page = c>
0.500: <Page = c> <Page = f>
0.500: <Page = a> <Page = c> <Page = f>

For example, the pattern <a, e> is not supported at all: for IP 170 the gap is 20, for IP 116 the gap is 40, and for IP 185 the gaps are 30 and 50. Every one of these gaps is larger than the max gap of 10.
Is this operator implemented incorrectly in RapidMiner?
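
To double-check the hand calculation, here is a minimal Python sketch that brute-forces the supports on the data above. It is not RapidMiner code, and the function names (`contains`, `support`, `frequent`) are made up for illustration. It assumes the usual GSP reading of the constraints: the gap between two consecutive pattern elements is the difference of their timestamps, it must be strictly greater than min gap and at most max gap, window=0, and each transaction holds a single page.

```python
from itertools import product

# Sessions from the post: IP -> list of (timestamp, page), sorted by time.
SESSIONS = {
    170: [(10, "a"), (20, "c"), (30, "e"), (40, "f")],
    17: [(10, "a"), (15, "c"), (20, "f")],
    116: [(10, "a"), (30, "c"), (50, "e"), (70, "d")],
    185: [(10, "a"), (20, "c"), (30, "f"), (40, "e"), (50, "b"), (60, "e")],
}


def contains(events, pattern, min_gap, max_gap, prev_time=None):
    """True if `pattern` occurs in `events` with every consecutive gap
    satisfying min_gap < gap <= max_gap (assumed GSP semantics)."""
    if not pattern:
        return True
    for time, page in events:
        if page != pattern[0]:
            continue
        if prev_time is not None:
            gap = time - prev_time
            if not (min_gap < gap <= max_gap):
                continue
        # Try every candidate position, not just the first match.
        if contains(events, pattern[1:], min_gap, max_gap, time):
            return True
    return False


def support(pattern, min_gap, max_gap):
    """Fraction of sessions (IPs) that contain the pattern."""
    hits = sum(contains(ev, pattern, min_gap, max_gap) for ev in SESSIONS.values())
    return hits / len(SESSIONS)


def frequent(min_gap, max_gap, min_sup, max_len=3):
    """Enumerate all page sequences of length 2..max_len and keep those
    whose support is at least min_sup (brute force, fine for tiny data)."""
    pages = sorted({p for ev in SESSIONS.values() for _, p in ev})
    result = {}
    for n in range(2, max_len + 1):
        for pattern in product(pages, repeat=n):
            s = support(pattern, min_gap, max_gap)
            if s >= min_sup:
                result[pattern] = s
    return result


if __name__ == "__main__":
    for pattern, s in frequent(min_gap=1, max_gap=10, min_sup=0.5).items():
        print(f"{s:.3f}: <{', '.join(pattern)}>")
```

Under these assumptions, the sketch with min gap=1, max gap=10, sup=0.5 reports <a, c> at 0.750, and <c, f> and <a, c, f> at 0.500 each (the last one is supported by IPs 17 and 185). The pattern <a, e> gets support 0. With max gap=1000 it reproduces the seven patterns listed for the first run.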



 
