TY - GEN
T1 - Clustering based approach to enhance association rule mining
AU - Kanhere, Samruddhi
AU - Sahni, Anu
AU - Stynes, Paul
AU - Pathak, Pramod
N1 - Publisher Copyright: © 2021 IEEE Computer Society. All rights reserved.
PY - 2021/1/27
Y1 - 2021/1/27
AB - Association rule mining algorithms such as Apriori and FPGrowth are used extensively in the retail industry to uncover consumer buying patterns. However, scaling these algorithms to rapidly growing data volumes is a major challenge. This research presents a novel clustering-based approach that addresses this challenge by reducing the dataset size. The products are clustered based on their frequency and price. Another important aspect of this study is finding interesting rules by performing differential market basket analysis, which identifies association rules that are likely to be ignored in the trivial approach. With the cluster-based approach, the same set of rules can be generated using only 7% of the total 16,210 items, which directly reduces processing overhead and therefore computation time. Furthermore, differential market basket analysis highlighted a few interesting rules that were missing from the original set of rules. The clustering-based approach used in this study not only captures frequent items but also accounts for their contribution to overall revenue by considering their price. In addition, the exclusion rate of the least contributing products improves from 45% to 93%. These results suggest that the computation cost can be significantly reduced and that more accurate rules can be generated by applying differential market basket analysis.
UR - https://www.scopus.com/pages/publications/85101189105
U2 - 10.23919/FRUCT50888.2021.9347577
DO - 10.23919/FRUCT50888.2021.9347577
M3 - Conference contribution
AN - SCOPUS:85101189105
T3 - Conference of Open Innovation Association, FRUCT
BT - Proceedings of the 28th Conference of Open Innovations Association FRUCT, FRUCT 2021
A2 - Balandin, Sergey
A2 - Deart, Vladimir
A2 - Tyutina, Tatiana
PB - IEEE Computer Society
T2 - 28th Conference of Open Innovations Association FRUCT, FRUCT 2021
Y2 - 27 January 2021 through 29 January 2021
ER -