cc 15(5): e2

Research Article

Automated Dimension Determination for NMF-based Incremental Collaborative Filtering

  • @ARTICLE{10.4108/eai.17-12-2015.150804,
        author={Xiwei Wang and Jun Zhang and Ruxin Dai},
        title={Automated Dimension Determination for NMF-based Incremental Collaborative Filtering},
        journal={EAI Endorsed Transactions on Collaborative Computing},
        keywords={auxiliary information, incremental clustering, data growth, collaborative Filtering, NMF},
  • Xiwei Wang
    Jun Zhang
    Ruxin Dai
    Year: 2015
    Automated Dimension Determination for NMF-based Incremental Collaborative Filtering
    DOI: 10.4108/eai.17-12-2015.150804
Xiwei Wang1,*, Jun Zhang2, Ruxin Dai3
  • 1: Department of Computer Science, Northeastern Illinois University, Chicago, Illinois 60625, USA
  • 2: Department of Computer Science, University of Kentucky, Lexington, Kentucky 40506-0633, USA
  • 3: Department of Computer Science and Information Systems, University of Wisconsin River Falls, River Falls, Wisconsin 54022, USA
*Contact email:


Collaborative filtering techniques based on nonnegative matrix factorization (NMF) have achieved great success in product recommendation. It is well known that in NMF, the dimensions of the factor matrices have to be determined in advance. Moreover, data is growing fast, so in some cases the dimensions need to be changed to reduce the approximation error. Recommender systems should be capable of incorporating new data in a timely manner without sacrificing prediction accuracy. In this paper, we propose an NMF-based data update approach with automated dimension determination for collaborative filtering. The approach determines the dimensions of the factor matrices and updates them automatically. It uses a nearest neighborhood based clustering algorithm to cluster users and items according to their auxiliary information, and uses the clusters as constraints in NMF. The dimensions of the factor matrices are tied to the number of clusters. When new data becomes available, the incremental clustering algorithm determines whether to increase the number of clusters or merge existing clusters. Experiments on three datasets (MovieLens, Sushi, and LibimSeTi) were conducted to examine the proposed approach. The results show that our approach can update the data quickly and provide encouraging prediction accuracy.
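
To make the link between cluster count and factor dimension concrete, here is a minimal sketch of one possible incremental nearest-centroid clustering step. It is not the authors' algorithm: the function name `assign_incrementally`, the Euclidean distance, the running-mean centroid update, and the `threshold` cutoff are all assumptions chosen for illustration, and the cluster-merging step mentioned in the abstract is omitted.

```python
import math

def assign_incrementally(centroids, counts, point, threshold):
    """Assign `point` to its nearest centroid, or open a new cluster.

    `threshold` is a hypothetical distance cutoff, not a value from the paper.
    Returns the index of the cluster that received the point.
    """
    best, best_dist = None, math.inf
    for i, c in enumerate(centroids):
        d = math.dist(c, point)  # Euclidean distance (an assumption)
        if d < best_dist:
            best, best_dist = i, d
    if best is None or best_dist > threshold:
        # No cluster is close enough: the number of clusters grows by one.
        centroids.append(list(point))
        counts.append(1)
        return len(centroids) - 1
    # Otherwise fold the point into the matched centroid (running mean).
    counts[best] += 1
    n = counts[best]
    centroids[best] = [m + (x - m) / n for m, x in zip(centroids[best], point)]
    return best

# Toy auxiliary-information vectors for five users (made-up values).
centroids, counts = [], []
users = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.0, 0.2)]
labels = [assign_incrementally(centroids, counts, u, threshold=1.0) for u in users]

# In this sketch the user-side factor dimension is the current cluster count.
rank = len(centroids)
```

In this toy run the five users fall into two clusters, so `rank` is 2. A user that arrives far from both centroids would open a third cluster and raise the dimension by one.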