A Survey on Recommender Systems

Mahdi Seyednezhad
BioComplex Laboratory, School of Computing, Florida Institute of Technology
May 2017

https://seyednejad.wixsite.com/home

Outline
• Introduction
• Methods
• Evaluation
• Context
• Social media

At the end, you will be able to answer:
• Why do we need recommender systems?
• How do traditional methods work?
• How to evaluate recommender systems?
• How to use context in recommender systems?
• What is social information, and how can it be used?

Why recommender systems?
• Ancient humans relied on advice from experts.
• We are not experts in everything!
• We need an expert for advice.
• Computers have changed the global market.

Egypt agriculture

© http://www.crystalinks.com/egyptagriculture.html


Growth in the number of Internet Users Percentage that do not use the Internet

©ICT facts and figures

Mobile network coverage and evolving technologies


Many online items: clothes, videos, books, people, websites, scientific papers, movies, magazines, news.

Which one is better?

Many online companies Recommending various items

Recommending specific items


What is a recommender system (RS)?

RSs recommend items to users based on users’ preferences.

[Figure labels: Items, Features, Users, Profile, Ratings, Recommender System]

Recommender system challenges

Data acquisition
• Explicitly
• Implicitly

Cold start
• New community
• New user
• New item

Data sparsity
• 99% of the user-item matrix elements have no value.
• Number of users is much larger than the number of items.

Traditional recommender systems A recommender system needs to filter the information to extract the relevant items.

Demographic Filtering

Content-based Filtering (CBF)

Collaborative Filtering (CF)


Demographic filtering
• It recommends items based on the demographic profile of users.
• People from the same group tend to have the same taste.

User      Gender  Country  Age  The Godfather
Diego     M       Brazil   27   +
Josemar   M       Brazil   20   +
Pacheco   M       Brazil   45   +
Hugo      M       Brazil   36
Firas     M       Iraq     51
Priya     F       India    ?
Marcos    M       Brazil   27   ?

Demographic filtering - Example
• Diego, Josemar and Pacheco liked “The Godfather”.
• How about Marcos?

User      Gender  Nationality  Age  The Godfather
Diego     M       Brazil       28   +
Josemar   M       Brazil       20   +
Pacheco   M       Brazil       45   +
Hugo      M       Brazil       36
Marcello  M       Italy        30
Firas     M       Iraq         51   -
Priya     F       India        ?    -
Marcos    M       Brazil       30   ? (probably +)

The rating column shows one more “-”, belonging to either Hugo or Marcello.

Content-based Filtering (CBF)
• The CBF method recommends items based on their descriptions.
• It consists of three major parts:

Content Analyzer
• Pre-processing
• Text to feature vector

Profile Learner
• Find average content of items
• Make prototype text vector

Filter Components
• Find similar documents
• Filter out dissimilar documents

de Gemmis, Marco, et al. "Semantics-aware content-based recommender systems." Recommender Systems Handbook. Springer US, 2015. 119-159.

CBF - Example

RS knows:
• Diego likes “The Godfather”.

RS does not know:
• Does Diego like “Goodfellas”?

Make him an offer he can't refuse!

CBF - Example

Description of one movie: Genre: Crime. Subject: Drama, Mafia.
Description of the other movie: Genre: Crime, Thriller. Subject: Drama, Mafia.

The two descriptions are similar, so the RS concludes that Diego likes “Goodfellas”.

CBF - Keyword-based Content Analyzer

TF-IDF:
• TF: Term Frequency
• IDF: Inverse Document Frequency
• It favors terms that are frequently found in one text (TF), but rarely in other documents (IDF).
• Calculate the similarity of documents, then filter out dissimilar documents.

$$TF(t_k, d_j) = \frac{f_{k,j}}{\max_z f_{z,j}}, \qquad IDF(t_k) = \log \frac{N}{n_k}$$

$$sim(d_i, d_j) = \frac{\sum_k w_{k,i} \cdot w_{k,j}}{\sqrt{\sum_k w_{k,i}^2} \cdot \sqrt{\sum_k w_{k,j}^2}}$$

$w_{k,j}$: weight of term $t_k$ in document $d_j$

Salton, G.: Automatic Text Processing. Addison-Wesley (1989)
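The formulas above can be sketched in a few lines of Python. This is only an illustration: the three toy documents are invented, and the term weight is assumed to be the usual product w = TF × IDF, which the slide implies but does not spell out.

```python
import math
from collections import Counter

def tf(tokens):
    """TF(t_k, d_j) = f_kj / max_z f_zj for one non-empty document."""
    counts = Counter(tokens)
    max_f = max(counts.values())
    return {t: f / max_f for t, f in counts.items()}

def idf(docs):
    """IDF(t_k) = log(N / n_k), with n_k the number of documents containing t_k."""
    n_docs = len(docs)
    doc_freq = Counter(t for doc in docs for t in set(doc))
    return {t: math.log(n_docs / n) for t, n in doc_freq.items()}

def tfidf_vectors(docs):
    """Assumed weighting: w_kj = TF(t_k, d_j) * IDF(t_k)."""
    idf_w = idf(docs)
    return [{t: w * idf_w[t] for t, w in tf(doc).items()} for doc in docs]

def cosine(a, b):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical keyword descriptions of three movies.
docs = [
    "crime drama mafia family".split(),
    "crime thriller drama mafia".split(),
    "romance comedy wedding".split(),
]
vecs = tfidf_vectors(docs)
sim_close = cosine(vecs[0], vecs[1])  # share crime, drama, mafia
sim_far = cosine(vecs[0], vecs[2])    # no shared terms
```

A filtering component would then keep only the documents whose similarity to the user's profile vector exceeds some chosen threshold.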

Collaborative Filtering (CF)

User-Item rating matrix (rows: users, columns: items):

User     The Godfather  Goodfellas  Scarface  Heat  …  Casino
Diego    5              5           2         5     …  3
Priya    4              ?           ?         2     …  4
Harith   5              2           2         2     …  ?
:        …              …           …         …     …  …
Josemar  4              5           4         2     …  3
Pacheco  5              5           2         ?     …  2

[Figure: user-item matrix in reality]

CF - Example Diego likes “The Godfather.” Marcos likes “The Godfather” as well.

Diego likes “Scarface.” How about Marcos?


CF - Find similar items based on ratings!

R: User × Item

Neighborhood-based methods:
• Item-Item Similarity
• User-User Similarity

Marcos likes “Scarface”.

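CF - Predict ratings

Item-Item Similarity:

$$w_{i,j} = \frac{\sum_{u \in U_{i,j}} (r_{u,i} - \bar{r}_i) \cdot (r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{i,j}} (r_{u,i} - \bar{r}_i)^2} \cdot \sqrt{\sum_{u \in U_{i,j}} (r_{u,j} - \bar{r}_j)^2}}$$

User-User Similarity:

$$w_{u,v} = \frac{\sum_{i \in I_{u,v}} (r_{u,i} - \bar{r}_u) \cdot (r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{u,v}} (r_{u,i} - \bar{r}_u)^2} \cdot \sqrt{\sum_{i \in I_{u,v}} (r_{v,i} - \bar{r}_v)^2}}$$

Prediction:

$$P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \cdot w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$$

• $U_{i,j}$: set of users that rated items i and j
• $I_{u,v}$: set of items that have been rated by users u and v
• $r_{u,i}$: rating of user u for item i
• $\bar{r}_i$: the average rating for item i
• $\bar{r}_u$: the average rating of user u
• $P_{a,i}$: predicted rating of user a for item i
• $w_{a,u}$: the weight between two users a and u

Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of collaborative filtering techniques." Advances in Artificial Intelligence 2009 (2009): 4.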
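A minimal sketch of user-user similarity and rating prediction, using the ratings from the example matrix (the "…" column is left out). Two choices here are assumptions rather than something the slides state: the means inside the similarity are taken over co-rated items only, and a similarity with a zero denominator is treated as 0.

```python
import math

# Ratings from the example user-item matrix; unknown ("?") entries are omitted.
ratings = {
    "Diego":   {"The Godfather": 5, "Goodfellas": 5, "Scarface": 2, "Heat": 5, "Casino": 3},
    "Priya":   {"The Godfather": 4, "Heat": 2, "Casino": 4},
    "Harith":  {"The Godfather": 5, "Goodfellas": 2, "Scarface": 2, "Heat": 2},
    "Josemar": {"The Godfather": 4, "Goodfellas": 5, "Scarface": 4, "Heat": 2, "Casino": 3},
    "Pacheco": {"The Godfather": 5, "Goodfellas": 5, "Scarface": 2, "Casino": 2},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def user_similarity(ratings, u, v):
    """Pearson correlation w_uv over the items rated by both u and v."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mean_u = mean(ratings[u][i] for i in common)  # assumption: mean over co-rated items
    mean_v = mean(ratings[v][i] for i in common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # assumption: undefined correlation counts as no similarity
    return num / (den_u * den_v)

def predict(ratings, a, item):
    """P_ai = mean(r_a) + sum((r_ui - mean(r_u)) * w_au) / sum(|w_au|)."""
    mean_a = mean(ratings[a].values())
    num = den = 0.0
    for u, user_ratings in ratings.items():
        if u == a or item not in user_ratings:
            continue
        w = user_similarity(ratings, a, u)
        num += (user_ratings[item] - mean(user_ratings.values())) * w
        den += abs(w)
    return mean_a if den == 0 else mean_a + num / den

w_pacheco_diego = user_similarity(ratings, "Pacheco", "Diego")
pacheco_heat = predict(ratings, "Pacheco", "Heat")
```

Item-item similarity follows the same pattern with the roles of users and items swapped.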

Hybrid methods
• In some applications, we combine the benefits of two or more recommendation methods.
• Usually, other methods are combined with collaborative filtering.
• For example, demographic filtering can help collaborative filtering in cold-start situations.

Bel-rrooh! Bel-ddam! Naftikah Godfather!

There is no rating from some Iraqi customers:
• Firas
• Harith
• Younes
“The Godfather” is popular in Iraq, so it is recommended to them.

Hybrid methods - CF+CBF

(a) CF and CBF each produce recommendations, which are combined with a weighting method. The system may rank the items from both and recommend the top items.
(b) CBF methods extract features and send them to CF, which makes the final recommendation.
(c) A unified model uses CF and CBF and passes their output to another classifier, such as a rule-based classifier or a probabilistic model.
(d) The collaborative filter recommends items to CBF, and CBF works on them.

Bobadilla, Jesús, et al. "Recommender systems survey." Knowledge-Based Systems 46 (2013): 109-132.

Evaluation Prediction accuracy

Quality of the list of items

Quality of the set of items


Prediction accuracy

Mean Absolute Error:

$$MAE = \frac{1}{\#U} \sum_{u \in U} \left( \frac{1}{\#O_u} \sum_{i \in O_u} |p_{u,i} - r_{u,i}| \right)$$

Root Mean Square Error:

$$RMSE = \frac{1}{\#U} \sum_{u \in U} \sqrt{\frac{1}{\#O_u} \sum_{i \in O_u} (p_{u,i} - r_{u,i})^2}$$

• O: set of rated items
• U: set of users
• $O_u$: set of items rated by user u
• $p_{u,i}$: prediction of the rating of user u for item i
• $r_{u,i}$: rating of user u for item i
• #{*}: cardinality of set {*}
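As a sketch, both metrics can be computed per user and then averaged over users, exactly as the two formulas are written. The nested-dictionary layout and the toy numbers are assumptions made for illustration.

```python
import math

def mae(pred, actual):
    """Average over users of the per-user mean absolute error."""
    per_user = []
    for u, items in actual.items():
        errors = [abs(pred[u][i] - r) for i, r in items.items()]
        per_user.append(sum(errors) / len(errors))
    return sum(per_user) / len(per_user)

def rmse(pred, actual):
    """Average over users of the per-user root mean square error."""
    per_user = []
    for u, items in actual.items():
        squared = [(pred[u][i] - r) ** 2 for i, r in items.items()]
        per_user.append(math.sqrt(sum(squared) / len(squared)))
    return sum(per_user) / len(per_user)

# Hypothetical true ratings r_ui and predictions p_ui for two users.
actual = {"u1": {"a": 4, "b": 2}, "u2": {"a": 5}}
pred = {"u1": {"a": 3, "b": 2}, "u2": {"a": 3}}

mae_value = mae(pred, actual)
rmse_value = rmse(pred, actual)
```

Because the error is averaged per user first, a user with few ratings counts as much as a user with many.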

Prediction accuracy - Coverage

Coverage for user u: the percentage of situations in which at least one k-neighbor of user u can rate an item that has not been rated by user u.
Total coverage: the average of the coverage over all users.
• It can be defined as the capacity of prediction.
• It is the percentage of the dataset for which the recommender system can make a prediction.

• $O_u$: set of items rated by user u
• $D_u$: set of items that have not been rated by user u
• $C_u$: set of items that have not been rated by user u and that at least one of the neighbors of u rated
• All rated items = $O_u \cup D_u$

$$coverage = \frac{1}{\#U} \sum_{u \in U} \left( 100 \times \frac{\#C_u}{\#D_u} \right)$$

Set quality

For some users, having a set of items recommended is very important, so there should be methods that evaluate the quality of the set of recommended items.

• Precision = X/Y: the percentage of relevant items among recommended items.
• Recall = X/Z: the percentage of recommended items among relevant items.

• X = number of relevant items recommended
• Y = n = number of recommended items
• Z = number of relevant items

• U: set of users
• $Z_u$: set of relevant items recommended to user u
• $Z^c_u$: set of relevant items not recommended to user u
• n: number of recommended items (size of the set of recommendations)

Bobadilla, Jesús, et al. "Recommender systems survey." Knowledge-Based Systems 46 (2013): 109-132.
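A small sketch of precision and recall for a single user, following the X, Y, Z definitions above. The item names are made up for the example.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) = (X / Y, X / Z) for one user."""
    recommended = set(recommended)
    relevant = set(relevant)
    x = len(recommended & relevant)  # relevant items that were recommended
    precision = x / len(recommended) if recommended else 0.0
    recall = x / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical recommendation set and relevant set for one user.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Averaging these values over all users in U gives the system-level figures.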

List quality
• Users' attention drops sharply for items further down the list.
• In some applications, items on top of the list are very important.
• Half-life (Hl) is one of the metrics to evaluate the quality of a recommendation list.

$$Hl = \frac{1}{\#U} \sum_{u \in U} \sum_{i=1}^{N} \frac{\max(r_{u,p_i} - d, 0)}{2^{(i-1)/(\alpha-1)}}$$
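The half-life formula can be sketched as follows. The slide does not define its symbols, so this example assumes the usual reading: r_{u,p_i} is the true rating of user u for the item at position i of the list, d is a default (neutral) rating, and α > 1 is the list position at which the chance of the user looking at the item has halved.

```python
def half_life(ranked_ratings_per_user, d, alpha):
    """Hl averaged over users.

    ranked_ratings_per_user: one list per user holding the user's true
    ratings of the recommended items, in list order (position 1 first).
    d: assumed default rating; alpha: assumed half-life position (> 1).
    """
    totals = []
    for ranked in ranked_ratings_per_user:
        total = sum(
            max(r - d, 0) / 2 ** ((i - 1) / (alpha - 1))
            for i, r in enumerate(ranked, start=1)
        )
        totals.append(total)
    return sum(totals) / len(totals)

# Two hypothetical users; ratings are listed in recommendation order.
hl = half_life([[5, 3, 4], [3, 5]], d=3, alpha=2)
```

A well-liked item contributes less the further down the list it appears, which is the attention drop the metric is meant to capture.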

Now, we know about:
• Traditional methods
• How to evaluate recommender systems

Let us step further and use more information to improve recommender systems.

[Figure labels: Why recommender systems?, Traditional methods, Evaluations, More information]

Context
• What is context?
• Webster’s: the interrelated conditions in which something exists or occurs.
• The concept of context is controversial.
• Interaction: time, location
• Feature: type of actors

Reed Hastings, the CEO of Netflix, claimed that considering such contextual information can improve the performance of their recommender system by up to 3%.

Example of context - Time • What should be recommended to Pacheco if he wants to watch a movie on Saturday?

• Pacheco watched these movies and TV series and liked them:

[Figure: titles liked by Pacheco, arranged by day of the week from Sunday to Saturday; label: Including Sunday]

Example of context - Features • Pacheco likes “The Godfather I and II”.

• Does Pacheco like “Heat”?

Context-based recommender systems: Pacheco likes it, because the movies have popular Academy Award winners in common.

Context in recommender systems

Traditional methods:

$$R : User \times Items \rightarrow Rating$$

Add context to recommender systems:

$$R : User \times Items \times Context \rightarrow Rating$$

Hierarchical representation:
[Figure: context hierarchy of an e-retailer DB, from coarse to granular. Level k1: Personal, Gift. Level k2: Work, Other, Partner/Friend, Parent/Other. Level k3: Partner, Friend, Parent, Other.]

Tensor representation, if we consider Time as context:
[Figure: User (UName, Address, Age) × Item (IName, Type, Price) × Time (Year, Month, Day)]

Palmisano, et al. "Using context to improve predictive modeling of customers in personalization applications." IEEE Transactions on Knowledge and Data Engineering 20.11 (2008): 1535-1549.

Obtaining contextual information

• Explicitly
  ▪ The information is gained directly from entities.
  ▪ The information about location or time can be extracted from the users' devices.

• Implicitly
  ▪ It needs a monitoring system to observe the users and interactions.
  ▪ The source of information is accessed directly.

• Inferring
  ▪ The RS should infer information from other data that has already been extracted.
  ▪ The information here is hidden and requires special algorithms to be revealed.

Utilizing context – Pre-filtering

How?
1. Use contextual information to filter the relevant data.
2. Feed the 2D (User × Item) data to a traditional method.

It uses context to filter out irrelevant data:
• Exact filtering
  ❖ Example: Find data about ratings on Saturdays.
  × Drawback: The filtered information is too narrow.
• Aggregation
  ❖ Example: Find data about ratings on the weekends.
  × Drawback: We don’t know how much aggregation we need.

[Figure: Data (U×I×C×R) → filter by context C → Contextualized Data (U×I×R) → 2D Recommender (U×I → R) → user u → Contextual Recommendations i1, i2, i3, ...]

Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender Systems Handbook, p. 191-226. Springer, 2015.
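The two pre-filtering variants can be sketched as a single filter over rating records, followed by a collapse to a 2D user × item matrix. The record layout, the day labels and the decision to average repeated ratings are all assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical rating records: (user, item, day_of_week, rating).
RECORDS = [
    ("Pacheco", "The Godfather", "Sat", 5),
    ("Pacheco", "The Godfather", "Sun", 4),
    ("Pacheco", "Heat", "Mon", 2),
    ("Diego", "Heat", "Sat", 5),
]

def pre_filter(records, contexts):
    """Keep records whose context is in `contexts`; return a user x item matrix.

    Several ratings for the same (user, item) pair are averaged (an assumption).
    """
    buckets = defaultdict(list)
    for user, item, context, rating in records:
        if context in contexts:
            buckets[(user, item)].append(rating)
    matrix = defaultdict(dict)
    for (user, item), values in buckets.items():
        matrix[user][item] = sum(values) / len(values)
    return dict(matrix)

exact = pre_filter(RECORDS, {"Sat"})           # exact filtering: Saturdays only
weekend = pre_filter(RECORDS, {"Sat", "Sun"})  # aggregation: the whole weekend
```

Either matrix can then be handed to any traditional 2D recommender; the aggregated one is denser but mixes contexts that may differ.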

Utilizing context – Post-filtering

How?
1. Find the recommendations based on items and users.
2. Adjust the final list of recommendations based on context:
   • Filter out the items that do not satisfy the context.
   • Reorder or rank the list with respect to the degree to which the items match the context.

[Figure: Data (U×I×C×R) → 2D Recommender (U×I → R) → user u → Recommendations i1, i2, i3, ... → adjust by context C → Contextual Recommendations i1, i2, i3, ...]

Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender Systems Handbook, p. 191-226. Springer, 2015.
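A possible sketch of the adjustment step: items whose match with the current context is too low are dropped, and the rest are re-ranked. The `context_match` scores, the threshold and the multiplicative re-ranking are hypothetical choices, not something prescribed by the slides.

```python
def post_filter(recommendations, context_match, threshold=0.5):
    """Filter and re-rank a context-free recommendation list.

    recommendations: list of (item, predicted_score) from a 2D recommender.
    context_match: assumed degree in [0, 1] to which each item fits the context.
    """
    kept = [
        (item, score * context_match.get(item, 0.0))
        for item, score in recommendations
        if context_match.get(item, 0.0) >= threshold
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical 2D recommendations and context fit for "Saturday evening".
recs = [("Heat", 4.8), ("The Godfather", 4.5), ("Casino", 4.0)]
match = {"Heat": 0.2, "The Godfather": 0.9, "Casino": 1.0}
adjusted = post_filter(recs, match)
adjusted_items = [item for item, _ in adjusted]
```

The 2D recommender itself stays untouched, which is the main practical appeal of post-filtering.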

Utilizing context – Modelling

How?
• This method uses the 3D data directly.
• We can use a similarity function to predict the unknown ratings.

$$r_{u,i,c} = k \sum_{(u',i',c') \neq (u,i,c)} W((u',i',c'),(u,i,c)) \times r_{u',i',c'}$$

$$W((u',i',c'),(u,i,c)) \propto \frac{1}{distance((u',i',c'),(u,i,c))}$$

• $r_{u,i,c}$: rating of user u for item i regarding context c
• k: normalization factor
• W: weight of a known rating in predicting the unknown rating

[Figure: Data (U×I×C×R) → MD Recommender (U×I×C → R) → user u, context C → Contextual Recommendations i1, i2, i3, ...]

Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender Systems Handbook, p. 191-226. Springer, 2015.
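The weighted sum above might be sketched like this. Two pieces are assumptions: the distance is a simple count of differing components between the two (user, item, context) triples, and k is chosen as 1 / ΣW so that the prediction is a weighted average.

```python
def distance(p, q):
    """Hypothetical distance: number of components of the triple that differ."""
    return sum(1 for x, y in zip(p, q) if x != y)

def predict(known, target):
    """r_uic = k * sum(W * r'), with W = 1 / distance and k = 1 / sum(W)."""
    weighted_sum = weight_total = 0.0
    for point, rating in known.items():
        if point == target:
            continue  # the sum runs over (u', i', c') != (u, i, c)
        w = 1.0 / distance(point, target)
        weighted_sum += w * rating
        weight_total += w
    return weighted_sum / weight_total if weight_total else None

# Hypothetical known ratings indexed by (user, item, context).
known = {
    ("u1", "i1", "weekend"): 5,
    ("u1", "i1", "weekday"): 3,
    ("u2", "i2", "weekday"): 1,
}
estimate = predict(known, ("u2", "i1", "weekend"))
```

A real system would replace the component count with a learned or domain-specific distance over users, items and contexts.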

How about Josemar on Twitter? Josemar follows Diego, Marcos, Pacheco and Firas on Twitter, and Diego follows him. There is no information about Josemar’s ratings or preferences.

[Figure: Twitter follow graph of Josemar, Diego, Marcos, Pacheco and Firas]

Social-based recommender systems

Why?
• Social networks are popular.
• User profiling is improved by social information.
• Items are available through social networks.
• It is a powerful source of information.
• Friends have similar tastes.

Content in social media
• Content can be recommended.
• Content can be used to improve recommendation.
• Content in social media:
  o Blog
  o News
  o Multimedia
  o Job
  o Question & Answer
  o Microblog

Ido Guy. Social recommender systems. In Recommender Systems Handbook, p. 511-543. Springer, 2015.

Example - Movie recommendation
• Carrer-Neto et al. use the information extracted from the profiles of users.
• Social aperture:
  ✓ Conservative: use user ratings only
  ✓ Moderate: use 25% friends’ ratings + 75% user ratings
  ✓ Liberal: use 50% friends’ ratings + 50% user ratings

Carrer-Neto, Walter, et al. "Social knowledge-based recommender system. Application to the movies domain." Expert Systems with Applications 39.12 (2012): 10990-11000.
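One way to read the three aperture levels is as a weighted average of the user's own rating and the mean rating of the user's friends. That interpretation, and the function below, are assumptions for illustration and not necessarily how Carrer-Neto et al. implement it.

```python
# Share of the friends' ratings for each aperture level (from the slide).
APERTURE = {"conservative": 0.0, "moderate": 0.25, "liberal": 0.5}

def blended_rating(user_rating, friends_ratings, aperture):
    """Mix the user's rating with the friends' mean rating (assumed blend)."""
    share = APERTURE[aperture]
    if not friends_ratings:
        return float(user_rating)  # nothing social to mix in
    friends_mean = sum(friends_ratings) / len(friends_ratings)
    return (1 - share) * user_rating + share * friends_mean

# Hypothetical case: the user rates 4, two friends rate 1 and 3.
conservative = blended_rating(4, [1, 3], "conservative")
moderate = blended_rating(4, [1, 3], "moderate")
liberal = blended_rating(4, [1, 3], "liberal")
```

The wider the aperture, the further the blended score moves toward the friends' opinion.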

Example – People recommendation

WTF (Who To Follow) algorithm:
1. Find the circle of trust (CoT).
2. Create a bipartite graph of individuals from the CoT.
3. Run Twitter's Money algorithm to find the relevant people.
4. Recommend the top relevant people.

Geil, Afton, et al. "WTF, GPU! Computing Twitter's who-to-follow on the GPU." Proceedings of the Second ACM Conference on Online Social Networks. ACM, 2014.

Immediate friend inference

• Item i has a set of attributes $b_1, b_2, \ldots, b_n$.
• User u has a set of attributes $a_1, a_2, \ldots, a_m$.
• User u has some neighbors $N_u$.

Question: what is the rating of user u for item i, given the set of attributes of item i, the set of attributes of user u, and the neighbors of user u?

Naive Bayesian assumption:
• User preference: probability of rating k of user u given the set of item attributes $b_i$.
• Item acceptance: probability of rating k for item i given the set of user attributes $a_u$.
• Neighbor preference: probability of rating k of user u given the neighbors $N_u$.

Immediate friend inference

Probability of rating k of user u for item i ($R_{u,i}$), given the set of attributes $a_u$ of the user, the set of attributes $b_i$ of the item and the ratings of the neighbors:

$$P(R_{u,i} = k \mid B = b_i, A = a_u, \{R_{v,i} = r_{v,i} : \forall v \in U_i \cap N_u\}) = \frac{1}{Z} P(R_{u,i} = k \mid B = b_i) \times P(R_{u,i} = k \mid A = a_u) \times P(R_{u,i} = k \mid \{R_{v,i} = r_{v,i} : \forall v \in U_i \cap N_u\})$$

• $U_i$: set of users that rated item i
• $N_u$: neighbors of user u
• First factor, user preference: probability of a rating k of user u given the set of item attributes $b_i$
• Second factor: item acceptance
• Third factor: preference of the neighbors of user u

Jianming He and Wesley W. Chu. A social network-based recommender system. In Data Mining for Social Network Data, p. 47-74. Springer, 2010.

Example

• User preference: What is my rating if Al Pacino plays in a movie?
• Item acceptance: What is the movie’s rating if somebody like user u watches it?
• Neighbor preference: What is my friends’ rating if Al Pacino plays in a movie?

Immediate friend inference

User preference:

$$P(R_u = k \mid B = b_i) = \frac{P(R_u = k) \times \prod_{j=1}^{n} P(B_j \mid R_u = k)}{P(B_1, B_2, \ldots, B_n)}, \quad B_j \in \{B_1, B_2, \ldots, B_n\}$$

Item acceptance:

$$P(R_i = k \mid A = a_u) = \frac{P(R_i = k) \times \prod_{j=1}^{m} P(A_j \mid R_i = k)}{P(A_1, A_2, \ldots, A_m)}, \quad A_j \in \{A_1, A_2, \ldots, A_m\}$$

Neighbor influence:

$$P(R_{u,i} = k \mid R_{v,i} = r_{v,i}) \propto H(k - r_{v,i})$$

H: histogram of the difference between the user’s rating and the neighbors’ ratings
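To make the combination concrete, the sketch below turns a difference histogram into a neighbor-influence distribution for one neighbor and multiplies the three factors, normalizing by Z. All numbers are invented, and the fallback to a uniform distribution when the histogram gives no support is an assumption.

```python
RATINGS = [1, 2, 3, 4, 5]

def neighbor_influence(hist, neighbor_rating):
    """P(R_ui = k | R_vi = r_vi) proportional to H(k - r_vi), for one neighbor."""
    weights = {k: hist.get(k - neighbor_rating, 0) for k in RATINGS}
    total = sum(weights.values())
    if total == 0:
        return {k: 1 / len(RATINGS) for k in RATINGS}  # assumed fallback
    return {k: w / total for k, w in weights.items()}

def combine(user_pref, item_accept, neighbor_pref):
    """Posterior over ratings: product of the three factors divided by Z."""
    unnormalized = {
        k: user_pref[k] * item_accept[k] * neighbor_pref[k] for k in RATINGS
    }
    z = sum(unnormalized.values())
    return {k: v / z for k, v in unnormalized.items()}

# Hypothetical inputs: H counts how often (user rating - neighbor rating)
# took each value; the neighbor rated the item 4.
hist = {-1: 2, 0: 6, 1: 2}
user_pref = {k: 0.2 for k in RATINGS}
item_accept = {1: 0.1, 2: 0.1, 3: 0.2, 4: 0.3, 5: 0.3}

posterior = combine(user_pref, item_accept, neighbor_influence(hist, 4))
best_rating = max(posterior, key=posterior.get)
```

The predicted rating can be taken as the most probable k, as here, or as the expectation over the posterior.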

Summary

Traditional methods
• Demographic filtering
• Content-based filtering
• Collaborative filtering
• Hybrid methods

Evaluation
• Accuracy
• Set quality
• List quality

Context-based
• What is context?
• Obtaining context
• Utilizing context: pre-filtering, post-filtering, modelling

Social-based
• Why social?
• Content in social media
• People recommendation
• Immediate friend inference