A Survey on Recommender Systems
Mahdi Seyednezhad
BioComplex Laboratory, School of Computing, Florida Institute of Technology
May 2017
[email protected] https://seyednejad.wixsite.com/home
Outline
At the end …
• Introduction: Why do we need recommender systems?
• Methods: How do traditional methods work?
• Evaluation: How to evaluate recommender systems?
• Context: How to use context in recommender systems?
• Social media: What is social information? How to use social information?
Why recommender systems?
• In ancient times, people relied on advice from experts.
• We are not experts in everything!
• We need an expert for advice.
• Computers have changed the global market.
[Figure: Egyptian agriculture. © http://www.crystalinks.com/egyptagriculture.html]
[Figures: growth in the number of Internet users; percentage of people who do not use the Internet; mobile network coverage and evolving technologies. © ICT Facts and Figures]
Many online items
Clothes, videos, books, people, websites, scientific papers, movies, magazines, news.
Which one is better?
Many online companies
• Recommending various items
• Recommending specific items
What is a recommender system (RS)?
RSs recommend items to users based on users’ preferences.
[Figure: a recommender system connecting Items (Features), Users (Profile), and Ratings.]
Recommender system challenges
Data acquisition
• Explicitly
• Implicitly
Cold start
• New community
• New user
• New item
Data sparsity
• 99% of the user-item matrix elements have no value.
• Number of users is much larger than the number of items.
Traditional recommender systems A recommender system needs to filter the information to extract the relevant items.
Demographic Filtering
Content-based Filtering (CBF)
Collaborative Filtering (CF)
Demographic filtering
• It recommends items based on the demographic profile of users.
• People from the same group tend to have the same taste.

User     Gender  Country  Age  The Godfather
Diego    M       Brazil   27   +
Josemar  M       Brazil   20   +
Pacheco  M       Brazil   45   +
Hugo     M       Brazil   36
Firas    M       Iraq     51
Priya    F       India    ?
Marcos   M       Brazil   27   ?
Demographic filtering - Example
• Diego, Josemar and Pacheco liked “The Godfather”.
• How about Marcos?

User      Gen.  Nationality  Age  The Godfather
Diego     M     Brazil       28   +
Josemar   M     Brazil       20   +
Pacheco   M     Brazil       45   +
Hugo      M     Brazil       36   -
Marcello  M     Italy        30
Firas     M     Iraq         51   -
Priya     F     India        ?    -
Marcos    M     Brazil       30   ? (probably +)
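The reasoning in this example can be sketched as a majority vote inside a demographic group. This is only an illustrative sketch: the grouping key (gender plus country), the function name `predict_by_demographics`, and the use of `None` for reactions that are not used are assumptions, not part of the original slides.

```python
from collections import Counter

# Toy data: the three "liked" reactions come from the example above;
# None marks users whose reaction is not used in this sketch.
users = {
    "Diego":   {"gender": "M", "country": "Brazil", "liked": True},
    "Josemar": {"gender": "M", "country": "Brazil", "liked": True},
    "Pacheco": {"gender": "M", "country": "Brazil", "liked": True},
    "Firas":   {"gender": "M", "country": "Iraq",   "liked": None},
    "Priya":   {"gender": "F", "country": "India",  "liked": None},
}

def predict_by_demographics(target, users):
    """Majority vote among users sharing the target's gender and country."""
    votes = Counter(
        u["liked"] for u in users.values()
        if u["liked"] is not None
        and u["gender"] == target["gender"]
        and u["country"] == target["country"]
    )
    if not votes:
        return None  # nobody comparable has reacted: no prediction
    return votes.most_common(1)[0][0]

marcos = {"gender": "M", "country": "Brazil"}
prediction = predict_by_demographics(marcos, users)
```

Marcos shares his group with three users who liked the movie, so the vote predicts that he probably likes it too.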
Content-based Filtering (CBF)
• The CBF method recommends items based on their descriptions.
• It consists of three major parts:
Content Analyzer
• Pre-processing
• Text to feature vector
Profile Learner
• Find average content of items
• Make prototype text vector
Filter Component
• Find similar documents
• Filter out dissimilar documents
de Gemmis, Marco, et al. "Semantics-aware content-based recommender systems." Recommender Systems Handbook. Springer US, 2015. 119-159.
CBF - Example
RS knows:
• Diego likes “The Godfather”.
RS does not know:
• Does Diego like “Goodfellas”?
Make him an offer he can't refuse!
CBF - Example
Description of one movie: Genre: Crime. Subject: Drama, Mafia.
Description of the other movie: Genre: Crime, Thriller. Subject: Drama, Mafia.
The descriptions are similar, so the RS infers that Diego likes “Goodfellas”.
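CBF - Content Analyzer - Keyword-based
TF-IDF:
• TF: Term Frequency. IDF: Inverse Document Frequency.
• It favors terms that are frequently found in one text (TF) but rarely in other documents (IDF).
• Calculate similarity of documents. Filter out dissimilar documents.

$$TF(t_k, d_j) = \frac{f_{k,j}}{\max_z f_{z,j}}, \qquad IDF(t_k) = \log \frac{N}{n_k}$$

$$sim(d_i, d_j) = \frac{\sum_k w_{k,i} \cdot w_{k,j}}{\sqrt{\sum_k w_{k,i}^2} \cdot \sqrt{\sum_k w_{k,j}^2}}$$

$f_{k,j}$ : frequency of term $t_k$ in document $d_j$
$N$ : number of documents; $n_k$ : number of documents that contain term $t_k$
$w_{k,j}$ : weight of term $t_k$ in document $d_j$
Salton, G.: Automatic Text Processing. Addison-Wesley (1989)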
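A minimal sketch of the TF-IDF weighting and cosine similarity described above, assuming each document is already tokenized and that the term weight is the product TF × IDF. The document names `movie_a`, `movie_b`, `movie_c` and their keyword lists are hypothetical toy data.

```python
import math
from collections import Counter

def tf(tokens):
    """Term frequency normalized by the most frequent term in the document."""
    counts = Counter(tokens)
    max_f = max(counts.values())
    return {t: f / max_f for t, f in counts.items()}

def idf(corpus):
    """log(N / n_k), where n_k is the number of documents containing term k."""
    n_docs = len(corpus)
    doc_freq = Counter(t for doc in corpus for t in set(doc))
    return {t: math.log(n_docs / n) for t, n in doc_freq.items()}

def tfidf(tokens, idf_table):
    """Assumed weighting: w = TF * IDF."""
    return {t: w * idf_table[t] for t, w in tf(tokens).items()}

def cosine(w_i, w_j):
    dot = sum(w * w_j.get(t, 0.0) for t, w in w_i.items())
    norm_i = math.sqrt(sum(w * w for w in w_i.values()))
    norm_j = math.sqrt(sum(w * w for w in w_j.values()))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)

# Hypothetical keyword descriptions of three movies.
movie_a = ["crime", "drama", "mafia"]
movie_b = ["crime", "thriller", "drama", "mafia"]
movie_c = ["romance", "comedy"]
corpus = [movie_a, movie_b, movie_c]

idf_table = idf(corpus)
vec_a, vec_b, vec_c = (tfidf(doc, idf_table) for doc in corpus)
sim_ab = cosine(vec_a, vec_b)  # shared keywords: similar
sim_ac = cosine(vec_a, vec_c)  # no shared keywords: dissimilar
```

A filter component could then keep only the items whose similarity to the user's profile exceeds some threshold.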
Collaborative Filtering (CF)
User-Item rating matrix:

User     The Godfather  Goodfellas  Scarface  Heat  …  Casino
Diego    5              5           2         5     …  3
Priya    4              ?           ?         2     …  4
Harith   5              2           2         2     …  ?
:        …              …           …         …     …  …
Josemar  4              5           4         2     …  3
Pacheco  5              5           2         ?     …  2

[Figure: the user-item matrix in reality (axes: items, users).]
CF - Example
• Diego likes “The Godfather.” Marcos likes “The Godfather” as well.
• Diego likes “Scarface.” How about Marcos?
CF - Find similar items based on ratings!
R: User × Item
Neighborhood-based methods:
• Item-Item Similarity
• User-User Similarity
Marcos likes Scarface.
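CF - Predict ratings
Item-Item Similarity:

$$w_{i,j} = \frac{\sum_{u \in U_{i,j}} (r_{u,i} - \bar{r}_i) \cdot (r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{i,j}} (r_{u,i} - \bar{r}_i)^2} \cdot \sqrt{\sum_{u \in U_{i,j}} (r_{u,j} - \bar{r}_j)^2}}$$

User-User Similarity:

$$w_{u,v} = \frac{\sum_{i \in I_{u,v}} (r_{u,i} - \bar{r}_u) \cdot (r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{u,v}} (r_{u,i} - \bar{r}_u)^2} \cdot \sqrt{\sum_{i \in I_{u,v}} (r_{v,i} - \bar{r}_v)^2}}$$

Prediction:

$$P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \cdot w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$$

$r_{u,i}$ : rating of user u for item i
$\bar{r}_i$ : the average rating for item i
$\bar{r}_u$ : the average rating of user u (so $\bar{r}_a$ is the average rating of user a)
$U_{i,j}$ : set of users that rated items i and j
$I_{u,v}$ : set of items that have been rated by users u and v
$P_{a,i}$ : predicted rating of user a for item i
$w_{a,u}$ : the weight between two users a and u
Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of collaborative filtering techniques." Advances in artificial intelligence 2009 (2009): 4.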
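The user-user similarity and the prediction rule can be sketched as follows. The ratings reuse the visible part of the user-item matrix shown earlier; computing each user's average over all of that user's ratings, and falling back to the user's average when no neighbor rated the item, are assumptions of this sketch.

```python
import math

# Visible part of the user-item rating matrix (unknown ratings omitted).
ratings = {
    "Diego":   {"The Godfather": 5, "Goodfellas": 5, "Scarface": 2, "Heat": 5, "Casino": 3},
    "Priya":   {"The Godfather": 4, "Heat": 2, "Casino": 4},
    "Harith":  {"The Godfather": 5, "Goodfellas": 2, "Scarface": 2, "Heat": 2},
    "Josemar": {"The Godfather": 4, "Goodfellas": 5, "Scarface": 4, "Heat": 2, "Casino": 3},
    "Pacheco": {"The Godfather": 5, "Goodfellas": 5, "Scarface": 2, "Casino": 2},
}

def mean_rating(user):
    values = ratings[user].values()
    return sum(values) / len(values)

def user_similarity(u, v):
    """Pearson-style weight w_{u,v} over the items rated by both users."""
    common = set(ratings[u]) & set(ratings[v])
    mu_u, mu_v = mean_rating(u), mean_rating(v)
    num = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    den_u = math.sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings[v][i] - mu_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

def predict(a, item):
    """P_{a,i}: user a's mean plus the weighted deviations of the neighbors."""
    num = den = 0.0
    for u in ratings:
        if u == a or item not in ratings[u]:
            continue
        w = user_similarity(a, u)
        num += (ratings[u][item] - mean_rating(u)) * w
        den += abs(w)
    if den == 0:
        return mean_rating(a)  # assumed fallback: no usable neighbor
    return mean_rating(a) + num / den

priya_goodfellas = predict("Priya", "Goodfellas")
```

The item-item variant follows the same pattern with the roles of users and items swapped.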
Hybrid methods
• In some applications, we combine the benefits of two or more recommendation methods.
• Usually, other methods are combined with collaborative filtering.
• For example, demographic filtering can help collaborative filtering in cold-start situations.
Bel-rrooh! Bel-ddam! Naftikah Godfather!
There is no rating from some Iraqi customers:
• Firas
• Harith
• Younes
The Godfather is popular in Iraq, so it is recommended to them.
Hybrid methods - CF+CBF
(a) It combines CF and CBF with a weighting method. It may rank the items from both and recommend the top items among them.
(b) It uses CBF methods to extract features and sends them to CF to make the final recommendation.
[Figure: four hybrid architectures (a)-(d) built from CF, CBF, and Model blocks, each producing a recommendation.]
Bobadilla, Jesús, et al. "Recommender systems survey." Knowledge-based systems 46 (2013): 109-132.
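The weighting idea can be sketched as a linear blend of the two score lists followed by a top-N cut. The function name `hybrid_top_n`, the linear blend, the default weight, and all scores below are illustrative assumptions; the survey does not prescribe this exact scheme.

```python
def hybrid_top_n(cf_scores, cbf_scores, weight_cf=0.5, n=2):
    """Blend CF and CBF scores linearly and return the n best items."""
    items = set(cf_scores) | set(cbf_scores)
    combined = {
        item: weight_cf * cf_scores.get(item, 0.0)
        + (1.0 - weight_cf) * cbf_scores.get(item, 0.0)
        for item in items
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(combined, key=lambda item: (-combined[item], item))[:n]

# Hypothetical normalized scores from each recommender.
cf_scores = {"Heat": 0.9, "Casino": 0.4, "Scarface": 0.6}
cbf_scores = {"Heat": 0.2, "Casino": 0.9, "Goodfellas": 0.8}

top_items = hybrid_top_n(cf_scores, cbf_scores)
```

An item missing from one recommender is scored 0 by it here, which penalizes items only one method knows about; other choices are possible.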
Hybrid methods - CF+CBF
(c) A unified model uses CF and CBF and feeds their outputs to another classifier, such as a rule-based classifier or a probabilistic model.
(d) The collaborative filter recommends items to CBF, and CBF works on them.
Bobadilla, Jesús, et al. "Recommender systems survey." Knowledge-based systems 46 (2013): 109-132.
Evaluation
• Prediction accuracy
• Quality of the list of items
• Quality of the set of items
Prediction accuracy
Mean Absolute Error:

$$MAE = \frac{1}{\#U} \sum_{u \in U} \left( \frac{1}{\#O_u} \sum_{i \in O_u} |p_{u,i} - r_{u,i}| \right)$$

Root Mean Square Error:

$$RMSE = \frac{1}{\#U} \sum_{u \in U} \sqrt{\frac{1}{\#O_u} \sum_{i \in O_u} (p_{u,i} - r_{u,i})^2}$$

$O$ : set of rated items
$U$ : set of users
$O_u$ : set of items rated by user u
$p_{u,i}$ : prediction of the rating of user u for item i
$r_{u,i}$ : rating of user u for item i
$\#\{*\}$ : cardinality of set $\{*\}$
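Both error measures average a per-user error over all users. A minimal sketch, assuming predictions and ratings are stored as nested dictionaries (user → item → value) and that every rated item has a prediction; the sample numbers are made up.

```python
import math

def mae(predictions, ratings):
    """Mean over users of the mean absolute error on each user's rated items."""
    per_user = []
    for user, rated in ratings.items():
        errors = [abs(predictions[user][item] - r) for item, r in rated.items()]
        per_user.append(sum(errors) / len(errors))
    return sum(per_user) / len(per_user)

def rmse(predictions, ratings):
    """Mean over users of the root mean square error on each user's rated items."""
    per_user = []
    for user, rated in ratings.items():
        squares = [(predictions[user][item] - r) ** 2 for item, r in rated.items()]
        per_user.append(math.sqrt(sum(squares) / len(squares)))
    return sum(per_user) / len(per_user)

# Hypothetical ratings and predictions.
ratings = {"u1": {"a": 5, "b": 3}, "u2": {"a": 2}}
predictions = {"u1": {"a": 4, "b": 3}, "u2": {"a": 4}}

mae_value = mae(predictions, ratings)
rmse_value = rmse(predictions, ratings)
```

RMSE penalizes large individual errors more strongly than MAE because the errors are squared before averaging.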
Prediction accuracy - Coverage
Coverage for user u: the percentage of situations in which at least one k-neighbor of user u can rate an item that has not been rated by user u.
Total coverage: the average of the coverage for each user.
• It could be defined as the capacity of prediction.
• It is the percentage of a dataset for which the recommender system can make predictions.

$$coverage = \frac{1}{\#U} \sum_{u \in U} \left( 100 \times \frac{\#C_u}{\#D_u} \right)$$

$O_u$ : set of items rated by user u
$D_u$ : set of items that have not been rated by user u
$C_u$ : set of items that have not been rated by user u and that at least one of user u's neighbors has rated
All items = $O_u \cup D_u$, and $C_u \subseteq D_u$
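The coverage definition can be sketched directly with Python sets. The neighbor lists are given as input here (how the k-neighbors are chosen is out of scope), and skipping users who have already rated every item is an assumption of this sketch; all data is hypothetical.

```python
def coverage(rated, neighbors, all_items):
    """Average over users of 100 * #C_u / #D_u."""
    per_user = []
    for user, own_items in rated.items():
        d_u = all_items - own_items  # items the user has not rated
        if not d_u:
            continue  # assumed: users with nothing left to predict are skipped
        c_u = {
            item for item in d_u
            if any(item in rated[v] for v in neighbors[user])
        }
        per_user.append(100.0 * len(c_u) / len(d_u))
    return sum(per_user) / len(per_user) if per_user else 0.0

# Hypothetical data: items rated by each user and each user's neighbors.
all_items = {"a", "b", "c", "d"}
rated = {"u1": {"a"}, "u2": {"a", "b"}, "u3": {"c"}}
neighbors = {"u1": ["u2"], "u2": ["u1", "u3"], "u3": ["u1"]}

total_coverage = coverage(rated, neighbors, all_items)
```

Here u1 and u3 can each get a prediction for one of three unrated items, and u2 for one of two, so the total coverage is the average of those three percentages.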
Set quality
For some users, the set of recommended items as a whole is very important, so we need metrics that evaluate the quality of that set.
Precision = X/Y: the percentage of recommended items that are relevant.
Recall = X/Z: the percentage of relevant items that are recommended.
X = number of relevant items recommended
Y = n = number of recommended items
Z = number of relevant items
U : set of users
$Z_u$ : set of relevant items recommended to user u
$Z^c_u$ : set of relevant items not recommended to user u
n : number of recommended items (size of the set of recommendations)
[Figure: within all possible items, the recommended set and the relevant set overlap in $Z_u$; $Z^c_u$ is the rest of the relevant set.]
Bobadilla, Jesús, et al. "Recommender systems survey." Knowledge-based systems 46 (2013): 109-132.
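For a single user, precision and recall reduce to counting the overlap between the recommended set and the relevant set. A small sketch with made-up item identifiers; returning 0.0 for an empty set is an assumed convention.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation set."""
    recommended = list(recommended)
    relevant = set(relevant)
    hits = sum(1 for item in recommended if item in relevant)  # X
    precision = hits / len(recommended) if recommended else 0.0  # X / Y
    recall = hits / len(relevant) if relevant else 0.0  # X / Z
    return precision, recall

# Hypothetical example: 4 items recommended, 3 items relevant, 2 in common.
precision, recall = precision_recall(["a", "b", "c", "d"], {"a", "c", "e"})
```

System-level values can then be obtained by averaging these per-user numbers over the set of users U.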
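List quality
• Users' attention drops drastically for items further down the list.
• In some applications, items on top of the list are very important.
• Half-life (Hl) is one of the metrics to evaluate the quality of a recommendation list.

$$Hl = \frac{1}{\#U} \sum_{u \in U} \sum_{i=1}^{N} \frac{\max(r_{u,p_i} - d, 0)}{2^{(i-1)/(\alpha-1)}}$$

$p_i$ : the item at position i of the recommendation list; $N$ : length of the list
$d$ : default rating; $\alpha$ : position in the list that the user has a 50% chance of reviewing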
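The half-life metric discounts each item's usefulness exponentially with its position in the list. A minimal sketch, assuming each user's list is given as the user's true ratings in ranked order and that `alpha` is greater than 1; the numbers are invented.

```python
def half_life_user(ranked_ratings, d, alpha):
    """Utility of one ranked list: ratings above d, discounted by position."""
    return sum(
        max(r - d, 0) / 2 ** ((i - 1) / (alpha - 1))
        for i, r in enumerate(ranked_ratings, start=1)
    )

def half_life(lists, d, alpha):
    """Average half-life utility over all users."""
    return sum(half_life_user(r, d, alpha) for r in lists.values()) / len(lists)

# Hypothetical true ratings of the recommended items, in list order.
lists = {"u1": [5, 3, 4], "u2": [2, 5]}
hl = half_life(lists, d=3, alpha=2)
```

With `alpha=2`, an item in second position counts half as much as the same item in first position, which is what makes mistakes at the top of the list expensive.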
Now, we know about:
• Traditional methods
• How to evaluate recommender systems
Let us step further and use more information to improve recommender systems.
[Figure: levels labelled "Why recommender systems?", "Traditional methods", and "Evaluations", with an arrow labelled "More information".]
Context
• What is context?
• Webster’s:
  ❑ The interrelated conditions in which something exists or occurs.
• The concept of context is controversial.
• Interaction:
  ➡ Time
  ➡ Location
• Feature:
  ➡ Type of actors
Reed Hastings, the CEO of Netflix, claimed that considering such contextual information can improve the performance of their recommender system by up to 3%.
Example of context - Time
• What should be recommended to Pacheco if he wants to watch a movie on Saturday?
• Pacheco watched these movies and TV series and liked them:
[Figure: titles Pacheco watched on each day of the week, Sunday through Saturday; annotation: "Including Sunday".]
Example of context - Features
• Pacheco likes “The Godfather I and II”.
• Does Pacheco like “Heat”?
Context-based recommender systems: Pacheco likes it, because the movies have popular Academy Award winners in common.
Context in recommender systems
Traditional methods:

$$R : User \times Item \rightarrow Rating$$

Add context to recommender systems:

$$R : User \times Item \times Context \rightarrow Rating$$

Two ways to represent context:
• Hierarchical representation
• Tensor representation (for example, if we consider Time as context)
[Figure, hierarchical representation: an e-retailer DB context hierarchy ranging from granular to coarse over levels k1, k2, k3, with values such as Personal, Gift, Work, Other, Partner, Friend, Parent.]
[Figure, tensor representation: dimensions User (UName, Address, Age), Item (IName, Type, Price), and Time as context (Year, Month, Day).]
Palmisano, et al. "Using context to improve predictive modeling of customers in personalization applications." IEEE transactions on knowledge and data engineering 20.11 (2008): 1535-1549.
Obtaining contextual information
• Explicitly
  ▪ The information is gained directly from entities.
  ▪ The information of location or time can be extracted from the users' devices.
• Implicitly
  ▪ It needs a monitoring system to observe the users and interactions.
  ▪ The source of information is accessed directly.
• Inferring
  ▪ The RS should infer information from other data that has already been extracted.
  ▪ The information here is hidden and requires special algorithms to be revealed.
Utilizing context – Pre-filtering
How?
1. Use contextual information to filter the relevant data.
2. Feed the 2D (User × Item) data to a traditional method.
It uses context to filter out irrelevant data:
• Exact filtering
  ❖ Example: Find data about ratings on Saturdays
  × Drawback: The filtered information is too narrow.
• Aggregation
  ❖ Example: Find data about ratings on the weekends
  × Drawback: We don’t know how much aggregation we need.
[Figure: Data (U×I×C×R) → filtered by context C → Contextualized Data (U×I×R) → 2D Recommender (U×I → R) → for user u → contextual recommendations i1, i2, i3, ...]
Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender systems handbook, p. 191-226. Springer, 2015.
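The two pre-filtering steps can be sketched as a filter over (user, item, context, rating) tuples followed by any 2D method. Here the 2D method is a deliberately trivial stand-in (the mean rating per item), and the records are invented; both are assumptions of this sketch.

```python
from collections import defaultdict

# Hypothetical U x I x C x R records: (user, item, day, rating).
data = [
    ("Pacheco", "Heat", "Saturday", 5),
    ("Pacheco", "Casino", "Monday", 2),
    ("Diego", "Heat", "Sunday", 4),
    ("Diego", "Casino", "Saturday", 3),
    ("Josemar", "Scarface", "Sunday", 5),
]

def prefilter(data, keep_context):
    """Step 1: keep only records whose context passes the predicate."""
    return [(u, i, r) for u, i, c, r in data if keep_context(c)]

def item_means(data_2d):
    """Step 2 stand-in: a trivial 2D 'recommender' scoring items by mean rating."""
    by_item = defaultdict(list)
    for _, item, rating in data_2d:
        by_item[item].append(rating)
    return {item: sum(rs) / len(rs) for item, rs in by_item.items()}

# Exact filtering: only ratings given on Saturdays.
exact = item_means(prefilter(data, lambda c: c == "Saturday"))
# Aggregation: ratings given on the weekend.
weekend = item_means(prefilter(data, lambda c: c in {"Saturday", "Sunday"}))
```

The exact filter leaves only two records, which illustrates the "too narrow" drawback; the weekend aggregation recovers more data at the cost of a coarser context.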
Utilizing context – Post-filtering
How?
1. Find the recommendations based on items and users.
2. Adjust the final list of recommendations based on context.
• Filter out the items that do not satisfy the context.
• Reorder or rank the list with respect to the degree to which the items match the context.
[Figure: Data (U×I×C×R) → 2D Recommender (U×I → R) → for user u → recommendations i1, i2, i3, ... → adjusted by context C → contextual recommendations i1, i2, i3, ...]
Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender systems handbook, p. 191-226. Springer, 2015.
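Post-filtering can be sketched as a second pass over an already computed recommendation list. The context-match degrees in [0, 1], the rule of multiplying the score by that degree, and the sample scores are all assumptions made for illustration.

```python
def postfilter(recommendations, context_match, threshold=0.0):
    """Drop items that do not fit the context, then re-rank the rest.

    recommendations: list of (item, score) from a context-free recommender.
    context_match: item -> degree in [0, 1] to which it fits the context.
    """
    adjusted = [
        (item, score * context_match.get(item, 0.0))
        for item, score in recommendations
        if context_match.get(item, 0.0) > threshold
    ]
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)

# Hypothetical context-free scores and context-match degrees.
recommendations = [("Heat", 4.8), ("Casino", 4.5), ("Scarface", 4.0)]
context_match = {"Heat": 0.5, "Casino": 1.0, "Scarface": 0.0}

contextual = postfilter(recommendations, context_match)
```

"Scarface" is filtered out because it does not match the context at all, and "Casino" overtakes "Heat" after re-ranking.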
Utilizing context – Modelling
How?
• This method uses the 3D data directly.
• We can use a similarity function to predict the unknown ratings.

$$r_{u,i,c} = k \sum_{(u',i',c') \neq (u,i,c)} W((u',i',c'),(u,i,c)) \times r_{u',i',c'}$$

$$W((u',i',c'),(u,i,c)) \propto \frac{1}{distance((u',i',c'),(u,i,c))}$$

$r_{u,i,c}$ : rating of user u for item i regarding context c
$k$ : normalization factor
$W$ : weight with which a known rating participates in predicting the unknown rating
[Figure: Data (U×I×C×R) → MD Recommender (U×I×C → R) → for user u and context C → contextual recommendations i1, i2, i3, ...]
Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender systems handbook, p. 191-226. Springer, 2015.
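The distance-weighted prediction can be sketched in a few lines. Two choices here are assumptions: the distance is the number of differing coordinates between two (user, item, context) points, and the normalization factor k is the inverse of the sum of the weights. The known ratings are invented.

```python
def mismatch_distance(p, q):
    """Assumed distance: number of coordinates (user, item, context) that differ."""
    return sum(1 for a, b in zip(p, q) if a != b)

def predict(target, known, distance=mismatch_distance):
    """Weighted average of all known ratings, with weight = 1 / distance."""
    weighted_sum = 0.0
    weight_total = 0.0
    for point, rating in known.items():
        if point == target:
            continue
        w = 1.0 / distance(point, target)  # distinct points differ somewhere, so distance >= 1
        weighted_sum += w * rating
        weight_total += w
    if weight_total == 0:
        return None  # nothing to base a prediction on
    return weighted_sum / weight_total  # k = 1 / sum of weights

# Hypothetical known ratings indexed by (user, item, context).
known = {
    ("Pacheco", "Heat", "Saturday"): 5,
    ("Pacheco", "Heat", "Monday"): 3,
    ("Diego", "Casino", "Monday"): 1,
}
prediction = predict(("Pacheco", "Heat", "Sunday"), known)
```

Ratings that share the user and the item with the target dominate the prediction, while the unrelated rating contributes with only a third of their weight.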
How about Josemar on Twitter?
Josemar follows Diego, Marcos, Pacheco and Firas on Twitter, and Diego follows him. There is no information about Josemar’s ratings or preferences.
[Figure: follow graph among Pacheco, Diego, Josemar, Marcos, and Firas.]
Social-based recommender systems
Why?
• Social networks are popular.
• User profiling is improved by social information.
• Items are available through social networks.
• It is a powerful source of information.
• Friends have similar tastes.
Content in social media
Content can be recommended. Content can be used to improve recommendation.
• Content in social media:
  o Blog
  o News
  o Multimedia
  o Job
  o Question & Answer
  o Microblog
Ido Guy, Social recommender systems. In Recommender systems handbook, p. 511-543. Springer, 2015.
Example - Movie recommendation
• Carrer-Neto et al. use the information extracted from the profile of users.
• Social aperture:
  • Moderate ✓ Use (25% friends’ ratings + 75% user ratings)
  • Liberal ✓ Use (50% friends’ ratings + 50% user ratings)
  • Conservative ✓ Use (user ratings)
Carrer-Neto, Walter, et al. "Social knowledge-based recommender system. Application to the movies domain." Expert Systems with applications 39.12 (2012): 10990-11000.
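The three social apertures can be read as three weights for blending a user's own rating with friends' ratings. The sketch below assumes a linear blend with the mean of the friends' ratings; how Carrer-Neto et al. actually combine the ratings may differ, and the names `APERTURES` and `blended_rating` are hypothetical.

```python
# Weight given to friends' ratings under each social aperture (from the slide).
APERTURES = {"conservative": 0.0, "moderate": 0.25, "liberal": 0.5}

def blended_rating(user_rating, friends_ratings, aperture):
    """Blend the user's rating with the mean of the friends' ratings (assumed rule)."""
    w = APERTURES[aperture]
    if not friends_ratings or w == 0.0:
        return float(user_rating)
    friends_mean = sum(friends_ratings) / len(friends_ratings)
    return w * friends_mean + (1.0 - w) * user_rating

moderate = blended_rating(4, [2, 2], "moderate")
liberal = blended_rating(4, [2, 2], "liberal")
conservative = blended_rating(4, [2, 2], "conservative")
```

The more liberal the aperture, the further the result moves from the user's own rating toward the friends' opinion.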
Example – People recommendation
WTF (Who To Follow) algorithm:
1. Find the circle of trust (CoT).
2. Create a bipartite graph of individuals from the CoT.
3. Run Twitter's Money algorithm to find the relevant people.
4. Recommend top relevant people.
Geil, Afton et al., "WTF, GPU! computing twitter's who-to-follow on the GPU." Proceedings of the second ACM conference on Online social networks. ACM, 2014.
Immediate friend inference
What is the rating of user u for item i, given the set of attributes of item i, the set of attributes of user u, and the neighbors of user u?
• Item i has a set of attributes b1, b2, …, bn.
• User u has a set of attributes a1, a2, …, am.
• User u has some neighbors Nu.
Naive Bayesian assumption:
• User preference: probability of rating k of user u given the set of item attributes bi
• Item acceptance: probability of rating k for item i given the set of user attributes au
• Neighbor preference: probability of rating k of user u given neighbors Nu
Immediate friend inference
Probability of rating k of user u for item i ($R_{u,i}$), given the set of attributes $a_u$ of the user, the set of attributes $b_i$ of the item, and the ratings of the neighbors:

$$P(R_{u,i} = k \mid B = b_i, A = a_u, \{R_{v,i} = r_{v,i} : \forall v \in U_i \cap N_u\}) = \frac{1}{Z} \, P(R_{u,i} = k \mid B = b_i) \times P(R_{u,i} = k \mid A = a_u) \times P(R_{u,i} = k \mid \{R_{v,i} = r_{v,i} : \forall v \in U_i \cap N_u\})$$

$U_i$ : set of users that rated item i
$N_u$ : neighbors of user u
$Z$ : normalization constant
First factor: user preference, the probability of a rating k of user u given the set of item attributes $b_i$
Second factor: item acceptance
Third factor: preference of the neighbors of user u
Jianming He and Wesley W. Chu, A social network-based recommender systems. In Data Mining for Social Network Data handbook, p. 47-74. Springer, 2010.
Example
• User preference: What is my rating if Al Pacino plays in a movie?
• Item acceptance: What is the movie’s rating if somebody like user u watches it?
• Neighbor preference: What is my friends’ rating if Al Pacino plays in a movie?
Immediate friend inference
User preference:

$$P(R_u = k \mid B = b_i) = \frac{P(R_u = k) \times \prod_{j=1}^{n} P(B_j \mid R_u = k)}{P(B_1, B_2, \ldots, B_n)}, \quad B_j \in \{B_1, B_2, \ldots, B_n\}$$

Item acceptance:

$$P(R_i = k \mid A = a_u) = \frac{P(R_i = k) \times \prod_{j=1}^{m} P(A_j \mid R_i = k)}{P(A_1, A_2, \ldots, A_m)}, \quad A_j \in \{A_1, A_2, \ldots, A_m\}$$

Neighbor influence:

$$P(R_{u,i} = k \mid R_{v,i} = r_{v,i}) \propto H(k - r_{v,i})$$

$H$ : histogram of the difference between the user’s rating and the neighbors’ ratings
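The way the three factors combine can be sketched numerically. The probability tables and the histogram below are invented, and multiplying the histogram values of several neighbors (treating neighbors as independent) is an assumption of this sketch rather than a detail given on the slides.

```python
def neighbor_influence(k_values, neighbor_ratings, histogram):
    """Unnormalized P(R = k | neighbors): product of H(k - r_v) over neighbors."""
    scores = {}
    for k in k_values:
        score = 1.0
        for r_v in neighbor_ratings:
            score *= histogram.get(k - r_v, 0)
        scores[k] = score
    return scores

def combine(user_pref, item_acc, neighbor_inf):
    """Multiply the three factors per rating and normalize by Z."""
    unnormalized = {k: user_pref[k] * item_acc[k] * neighbor_inf[k] for k in user_pref}
    z = sum(unnormalized.values())
    return {k: v / z for k, v in unnormalized.items()}

# Hypothetical distributions over ratings 1..3.
user_pref = {1: 0.2, 2: 0.3, 3: 0.5}  # P(R_u = k | B = b_i)
item_acc = {1: 0.1, 2: 0.4, 3: 0.5}   # P(R_i = k | A = a_u)
# Hypothetical histogram of (user rating - neighbor rating) differences.
histogram = {0: 6, 1: 3, -1: 3, 2: 1, -2: 1}

neighbor_inf = neighbor_influence(user_pref, [3, 2], histogram)
posterior = combine(user_pref, item_acc, neighbor_inf)
predicted_rating = max(posterior, key=posterior.get)
```

The predicted rating is the value of k with the highest combined probability.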
Summary
• Traditional methods
  • Demographic filtering
  • Content-based filtering
  • Collaborative filtering
  • Hybrid methods
• Evaluation
  • Accuracy
  • Set quality
  • List quality
• Context-based
  • What is context?
  • Obtaining context
  • Utilizing context: Pre-filtering, Post-filtering, Modelling
• Social-based
  • Why social?
  • Content in social media
  • People recommendation
  • Immediate friend inference