Time-aware Collaborative Topic Regression: Towards Higher Relevance in Textual Items Recommendation

July 12th, 2018. BIRNDL 2018, Ann Arbor. Anas Alzogbi, University of Freiburg, Databases & Information Systems.

Motivation: More than 100K papers in computer science are published yearly, with 3 times more papers in 2010 than in 2000 [Onur, 2015]. Which papers are relevant to me?

Problem Definition: Given a set of n users, a set of m papers, and the users' interactions with papers (with timestamps), build a ratings matrix (users 1..n as rows, papers 1..m as columns) with ones at the positive interactions and zeros elsewhere. An entry of 1 is a positive rating; an entry of 0 is an unknown rating. Task: predict the unknown ratings.
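
As a minimal sketch of this setup, the ratings matrix can be built from timestamped interaction triples. The function name `build_rating_matrix`, the 0-based indices, and the sample interactions are illustrative assumptions, not taken from the talk.

```python
from typing import Iterable, List, Tuple

def build_rating_matrix(n_users: int, n_papers: int,
                        interactions: Iterable[Tuple[int, int, float]]) -> List[List[int]]:
    """Return an n_users x n_papers 0/1 matrix from (user, paper, timestamp) triples."""
    R = [[0] * n_papers for _ in range(n_users)]
    for user, paper, _timestamp in interactions:
        R[user][paper] = 1  # positive interaction; every other entry stays 0 (unknown)
    return R

# Hypothetical interactions: (user index, paper index, timestamp)
interactions = [(0, 1, 1.0), (0, 3, 2.5), (1, 0, 0.5)]
R = build_rating_matrix(2, 4, interactions)
```

The timestamps are ignored here; they only matter once ratings are weighted by age.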

Collaborative Topic Regression (CTR) [Wang, 2011]: A hybrid recommendation approach. It builds on matrix factorization and extends it to benefit from the items' textual content. It jointly learns item and user latent factors from the rating matrix and the document matrix. Diagram: CTR takes the rating matrix and the document matrix as input and produces the users latent matrix and the papers latent matrix, which yield the prediction matrix. Limitation: all ratings have the same importance!

Concept drift in user interest: User interest might change over time. Figure: a timeline of a user's ratings for papers i1 to i7 from the corpus, with the point at which a recommendation is requested. Not all ratings represent the actual user interest to the same extent. The rating's time should be considered when learning the user latent model!

Time in Recommender Systems:
timeSVD++ [Koren, 2010]: a matrix factorization (MF) approach that learns time-based biases along with the user and item latent factors. Predictions can be computed only for time intervals that have already been seen.
Time series models [Liu, 2015], [Lu, 2016], [Gao, 2017]: apply MF to each interval individually, then learn an auto-regressive model that finds the linear correlation between the intervals' models. This adds complexity and requires a rich and long history of ratings.
Forgetting mechanism: old ratings are either discarded or down-weighted. A forgetting factor regulates the weight calculation.

Time-aware Collaborative Topic Regression (T-CTR): Confidence scores in implicit feedback. In the objective (confidence-weighted error plus regularization), the confidence score for rating $R_{ui}$ is $C_{ui} = a$ if $R_{ui} = 1$ and $C_{ui} = b$ otherwise, with $a > b > 0$. Each rating is associated with a confidence weight that controls the rating's importance.
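
A rough sketch of how such confidence scores enter a squared-error term, assuming the scheme $C_{ui} = a$ for positive ratings and $b$ otherwise with $a > b > 0$. The values `a=1.0` and `b=0.01`, the function names, and the toy latent vectors are illustrative assumptions. The regularization term is deliberately left out.

```python
def confidence(r_ui: int, a: float = 1.0, b: float = 0.01) -> float:
    """Confidence a for a positive rating, b otherwise (requires a > b > 0)."""
    return a if r_ui == 1 else b

def weighted_squared_error(R, U, V, a: float = 1.0, b: float = 0.01) -> float:
    """Sum over all (u, i) of C_ui / 2 * (R_ui - u_u . v_i)^2, without regularization."""
    total = 0.0
    for u, row in enumerate(R):
        for i, r in enumerate(row):
            pred = sum(x * y for x, y in zip(U[u], V[i]))  # dot product of latent vectors
            total += confidence(r, a, b) / 2.0 * (r - pred) ** 2
    return total

# Toy example: one user, two papers, one latent dimension (hypothetical values)
R = [[1, 0]]
U = [[1.0]]
V = [[0.5], [0.5]]
err = weighted_squared_error(R, U, V)
```

Both predictions are 0.5, but the error on the positive rating counts a hundred times more than the error on the unknown one.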

Time-aware Collaborative Topic Regression (T-CTR): How should the ratings' confidence weights be set? Old ratings should have less influence, but users might have different dynamics: some users tend to stick longer to the same topic. Figure: rating timelines spanning 6 months (items i1 to i7), up to the point where a recommendation is requested. The paper's age is not enough! i2 and i4 should not have the same confidence weight.

User concept-drift score: An individual concept-drift score is computed for each user, measuring how heterogeneous the items in the user's ratings are. Each item is represented by its LDA topics, the pairwise similarity between each two successive items is computed, and the similarities are averaged. User concept-drift score: $S_u = 1 - \text{AverageSimilarity}$.
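
The score can be sketched as one minus the average similarity between successive items. The slide names the similarity only as "Sim", so cosine similarity is an assumption here, as are the function names and the toy two-topic vectors.

```python
import math
from typing import List, Sequence

def cosine(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine similarity; an assumed choice, the slide does not name the measure."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0

def concept_drift_score(topic_vectors: List[Sequence[float]]) -> float:
    """S_u = 1 - average similarity of successive items, ordered by rating time."""
    if len(topic_vectors) < 2:
        return 0.0  # assumption: no drift is measurable from fewer than two items
    sims = [cosine(a, b) for a, b in zip(topic_vectors, topic_vectors[1:])]
    return 1.0 - sum(sims) / len(sims)

# Hypothetical LDA topic distributions over two topics
stable_user = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
drifting_user = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
s_stable = concept_drift_score(stable_user)
s_drifting = concept_drift_score(drifting_user)
```

A user who stays on one topic scores near 0, while a user who switches topic with every rating scores near 1.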

Ratings confidence weights: The confidence weight of a rating is decided by the rating's age and the user's concept-drift score. Time decay function: $W_{ui} = \frac{2}{1 + e^{S_u (t - t_{R_{ui}})}}$, where $t$ is the current time and $t_{R_{ui}}$ is the time of rating $R_{ui}$. The forgetting factor is the user's concept-drift score $S_u$.
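
Reading the decay function as $W_{ui} = 2 / (1 + e^{S_u (t - t_{R_{ui}})})$, a small sketch shows its behaviour. The function name, the overflow guard, and the choice of months as the time unit are assumptions for illustration.

```python
import math

def rating_weight(s_u: float, t_now: float, t_rating: float) -> float:
    """Time-decayed weight 2 / (1 + exp(S_u * age)); equals 1 for a brand-new rating."""
    age = t_now - t_rating
    exponent = min(s_u * age, 700.0)  # guard against overflow in math.exp
    return 2.0 / (1.0 + math.exp(exponent))

# Hypothetical case: a rating made 6 time units (say, months) ago
w_slow_drift = rating_weight(0.1, 6.0, 0.0)  # user who sticks to a topic
w_fast_drift = rating_weight(0.9, 6.0, 0.0)  # user whose interest shifts quickly
w_fresh = rating_weight(0.9, 6.0, 6.0)       # rating made just now
```

For the same rating age, the user with the higher drift score forgets faster, and a user with $S_u = 0$ keeps full weight on all ratings.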

T-CTR Model Learning and prediction: Set the confidence scores: $C_{ui} = \max(W_{ui}, b)$ if $R_{ui} = 1$, and $C_{ui} = b$ otherwise. Learn the model parameters that maximize the log-likelihood [formula not preserved]. Predictions: [formula not preserved].
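
The confidence rule $C_{ui} = \max(W_{ui}, b)$ for positive ratings and $b$ otherwise can be sketched as below. The prediction formula did not survive on this slide; the dot product of user and paper latent vectors used in `predict` is the usual matrix factorization form and is an assumption here, as are all names and values.

```python
from typing import List, Sequence

def confidence_matrix(R: List[List[int]], W: List[List[float]],
                      b: float = 0.01) -> List[List[float]]:
    """C_ui = max(W_ui, b) where R_ui == 1, and b elsewhere."""
    return [[max(w, b) if r == 1 else b for r, w in zip(r_row, w_row)]
            for r_row, w_row in zip(R, W)]

def predict(user_vec: Sequence[float], paper_vec: Sequence[float]) -> float:
    """Assumed prediction rule: inner product of the two latent vectors."""
    return sum(x * y for x, y in zip(user_vec, paper_vec))

# Hypothetical ratings and time-decayed weights for one user and three papers
R = [[1, 1, 0]]
W = [[0.8, 0.001, 0.5]]
C = confidence_matrix(R, W)
score = predict([1.0, 2.0], [0.5, 0.25])
```

Taking the maximum with `b` means a very old positive rating is never trusted less than an unknown entry.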

Evaluation Experiments: Dataset from CiteULike: ~3K users; ~210K papers with titles, abstracts and keywords; ~285K ratings; from Nov 2004 to Dec 2007.

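Time-aware vs time-ignorant evaluation: Figure: a user's ratings on a timeline in the order i1, i2, i6, i3, i4, i5, i7. Time-ignorant split: training set {i1, i2, i6, i4, i7}, test set {i3, i5}. Time-aware training/test data split: the ratings are divided along the timeline (the item assignment in the figure is not recoverable).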
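
The two evaluation setups can be contrasted with a small sketch: a time-aware split holds out each user's most recent ratings, while a time-ignorant split holds out ratings at random. The function names, the `n_test` parameter, and the integer timestamps are illustrative assumptions; the talk's exact splitting protocol is not given on the slide.

```python
import random
from typing import List, Tuple

Rating = Tuple[str, int]  # (item id, timestamp)

def time_aware_split(ratings: List[Rating], n_test: int):
    """Train on the oldest ratings, test on the n_test most recent ones."""
    ordered = sorted(ratings, key=lambda r: r[1])
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]

def time_ignorant_split(ratings: List[Rating], n_test: int, seed: int = 0):
    """Hold out n_test ratings at random, regardless of their timestamps."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

# One user's ratings in the timeline order shown on the slide
ratings = [("i1", 1), ("i2", 2), ("i6", 3), ("i3", 4), ("i4", 5), ("i5", 6), ("i7", 7)]
aware_train, aware_test = time_aware_split(ratings, 2)
ignorant_train, ignorant_test = time_ignorant_split(ratings, 2)
```

In the time-ignorant split a test rating may be older than some training ratings, so the model gets to learn from the user's future.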

Time-aware vs time-ignorant evaluation: CTR performance is measured on two setups: time-aware evaluation and time-ignorant evaluation. Results under time-aware evaluation are worse but realistic.

T-CTR performance: Baselines: Collaborative Topic Regression (CTR) [Wang, 2011]; Collaborative Evolution for User Profiling (CE) [Lu, 2016]; Collaborative Filtering for Implicit Feedback (CF) [Hu, 2008]. Results: [plot not preserved].

User-specific concept-drift score importance: Compare the following setups: a common concept-drift score for all users (CTR-0.1, CTR-0.5, CTR-1) versus a user-specific concept-drift score (T-CTR). Results: [plot not preserved].

Conclusion & Future Work. Conclusion: a time-aware hybrid recommender system that dynamically adapts to each user's dynamics in interest drift; a study on a real-world dataset; time-aware vs time-ignorant offline evaluations. Future Work: develop a probabilistic model that learns the users' concept-drift scores instead of the heuristic approach. [http://www.superscholar.org]

Thanks for your attention. Questions & Comments? alzoghba@informatik.uni-freiburg.de

References
[Onur, 2015] Küçüktunç, O., Saule, E., Kaya, K., Çatalyürek, Ü.V.: Diversifying citation recommendations. ACM Transactions on Intelligent Systems and Technology (TIST) 5(4), 55 (2015)
[Wang, 2011] Wang, C., Blei, D.M.: Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 448-456. ACM (2011)
[Koren, 2010] Koren, Y.: Collaborative filtering with temporal dynamics. Communications of the ACM 53(4), 89-97 (2010)
[Liu, 2015] Liu, X.: Modeling users' dynamic preference for personalized recommendation. In: Proceedings of the 24th International Conference on Artificial Intelligence, pp. 1785-1791 (2015)
[Gao, 2017] Gao, L., Wu, J., Zhou, C., Hu, Y.: Collaborative dynamic sparse topic regression with user profile evolution for item recommendation. In: AAAI Conference on Artificial Intelligence, pp. 1316-1322 (2017)
[Hu, 2008] Hu, Y., Koren, Y., Volinsky, C.: Collaborative filtering for implicit feedback datasets. In: Eighth IEEE International Conference on Data Mining (ICDM'08), pp. 263-272. IEEE (2008)
[Lu, 2016] Lu, Z., Pan, S.J., Li, Y., Jiang, J., Yang, Q.: Collaborative evolution for user profiling in recommender systems. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3804-3810 (2016)