Recommending Hashtags to Forthcoming Tweets in Microblogging

Shuangyong Song, Yao Meng, Zhongguang Zheng
Internet Application Laboratory, Fujitsu R&D Center Co., Ltd., Beijing 100025, China
{shuangyong.song, mengyao, zhengzhg}@cn.fujitsu.com

Abstract: Over the last few years, microblogging has become an increasingly important platform for users to acquire information and to publish reviews and personal status updates. In microblogging, a hashtag is a topic word or phrase enclosed between two ‘#’ symbols, typically naming a social event or a hot topic. Hashtags highlight the topic of tweets and make tweets easier for others to search and understand. Therefore, many users like adding hashtags to their tweets. Existing hashtag recommendation methods usually consider the semantic similarity between hashtags and tweets as the only key factor. However, the user acceptance degree and the development tendency of a hashtag are two further important factors for evaluating how likely it is to be worth recommending. In this paper, we propose a model that provides related hashtags from which users can choose one or more to add to a forthcoming tweet. The model considers three factors: the semantic similarity between a hashtag and a tweet, the user acceptance degree of the hashtag, and the development tendency of the hashtag. Experimental results on a dataset from Sina Weibo, one of the largest Chinese microblogging websites, show the usefulness of the proposed model for recommending hashtags to forthcoming tweets.

Keywords: microblogging; forthcoming tweets; web service; hashtag recommendation; topic model; hashtag development tendency detection

I. INTRODUCTION

With the rapid development of the Internet, social media has played an increasingly significant role in our everyday lives, for example by providing reports on world events and by improving enterprise influence through social media marketing. Microblogging has become one of the most popular social media platforms, where users can share their daily activities, exchange opinions, publish posts on trending topics and follow others to get relevant information about topics that interest them. A particular and important feature of microblogging is the hashtag, a short-hand convention adopted by microblogging users to manually assign their posts to a wider corpus of messages on the same topic. Hashtags are denoted with the # symbol preceding a short string, often a name or abbreviation [1], such as ‘#The_big_bang_theory’, ‘#Michael_Jackson’ and ‘#computer_hardware_engineering’. In microblogging, hashtags serve as brief topical markers, and they are usually adopted by users who contribute similar content or express a related idea. As discussed above, hashtags highlight the topic of tweets and make those tweets easier for other users to search and understand [2]. Therefore, many users like adding hashtags to their tweets.

Even though hashtags are so useful, not all tweets are marked with them. According to our incomplete statistics, fewer than 20% of tweets are published with hashtags. One possible reason is that defining a hashtag is harder than writing a tweet. Another is that the user may not know whether a self-defined hashtag will increase the readability and topicality of a forthcoming tweet. Furthermore, users may prefer to choose existing hashtags, which have been used many times and accepted by many users, for a forthcoming tweet. For example, even for a tweet like ‘I am going to bed~ ’, a hashtag such as ‘#Good_evening’, which has been used more than 17 million times, can be offered as a selectable hashtag. Therefore, how to automatically generate or recommend hashtags has become an important research topic, since it can help users obtain useful hashtags for their forthcoming tweets more easily.

Existing methods for hashtag recommendation usually consider only the semantic similarity between hashtags and tweets as the key factor, and this has proved quite effective for the hashtag recommendation task [3, 4, 5]. The semantic similarity factor helps detect the most content-relevant hashtags for a forthcoming tweet, and those hashtags can highlight its topic. We also take semantic similarity as a basic factor for this task, and further consider two other factors: the user acceptance degree of a hashtag and the development tendency of a hashtag. Both are useful for improving the probability that the tweet is found by search, which in turn helps increase its diffusion. For example, suppose a user writes a tweet about the ‘777 airplane crash’. ‘#Boeing_777_crash’ and ‘#Asiana_Airlines_Boeing_777_crash’ are both content-relevant hashtags and both have high semantic similarity with the tweet. However, because ‘#Boeing_777_crash’ is shorter than ‘#Asiana_Airlines_Boeing_777_crash’, the former appears as a hashtag in 77325 tweets, while the latter appears in only 636. So ‘#Boeing_777_crash’ should be more likely to be recommended for the tweet. Besides, because this event took place in July 2013, both hashtags now have a low development tendency, so the two are similar on that factor. To sum up, ‘#Boeing_777_crash’ will be given a higher ranking than ‘#Asiana_Airlines_Boeing_777_crash’. To the best of our knowledge, no other work on recommending hashtags to forthcoming tweets has considered those two factors.

In this paper, we propose a novel hashtag recommendation model that considers all of the above factors. First, we train a topic model on a large number of tweets to obtain the topic vectors of both hashtags and forthcoming tweets, which are used to calculate the semantic similarity between them. Then we detect the user acceptance degree and development tendency of each hashtag. Finally, we rank the hashtags with the three factors, and the top ranked ones are recommended to the authors of the forthcoming tweets.

The remainder of this paper is organized as follows. In Section II, related work is discussed. Section III introduces the proposed model. The detailed experimental analysis is presented in Section IV. Finally, we conclude the paper and present some directions for future work in Section V.

II. LITERATURE REVIEW

The model presented in this paper is intended for recommending hashtags to forthcoming tweets on the basis of multiple hashtag related factors. Our work is related to work on content based hashtag recommendation and on development tendency detection of time series. In this section, we discuss these two lines of related work separately.

A. Content Based Hashtag Recommendation

Research on microblogging hashtag recommendation systems has attracted some interest. Existing research has mainly focused on analyzing the semantic similarity between hashtags and tweets, in order to detect the implicit semantic relationship between them. For example, Ding et al. adopted a topic-specific translation model (TSTM) to detect the topic-specific word-tag alignment between hashtags and tweet words. It combines the advantages of both the topic model and the translation model to discover whether a hashtag has a possible content link to the given tweet [3]. Sedhai and Sun focused on the task of recommending hashtags for hyperlinked tweets, i.e., tweets that link to web pages, and used the linked web page content as additional information to enrich the tweet content [4]. She and Chen treated hashtags as labels of topics and developed a supervised topic model to discover the relationship among words, hashtags and topics of tweets. They then calculated the semantic similarity between hashtags and tweets from the topic model results in order to recommend content related hashtags [5]. However, apart from the semantic similarity factors, little work has considered hashtag popularity as a key factor in the hashtag recommendation task, although it is a very important factor for evaluating how much a hashtag helps the future diffusion of a forthcoming tweet.

B. Development Tendency Detection of Time Series

Development tendency detection of time series has also attracted some researchers’ interest. Mark Last proposed a system called OLIN (On-Line Information Network), which adapts itself automatically to the rate of concept drift in a non-stationary data stream by dynamically adjusting the size of the training window and the rate of model update. He applied signal-processing techniques to partition each series of daily stock values into a sequence of intervals having distinct slopes, which serve as their development trends [15]. Kleinberg proposed a burst detection technique for discovering a ‘burst of activity’ in a document stream, which signals the appearance of a topic in the stream. The approach is based on modeling the stream using an infinite-state automaton, in which bursts appear naturally as state transitions. It can be viewed as drawing an analogy with models from queuing theory for bursty network traffic [16].

In this paper, we also need to detect the development tendency of hashtag diffusion. However, neither of the two methods discussed above can be applied to our task. Our hashtag time series are discrete, whereas the method in [15] can only be used for a continuous data stream. Besides, we need a real value to represent a hashtag’s development tendency, whereas the method in [16] can only output an integer ‘bursty state’ value. Therefore, we design a simple yet effective development tendency detection method for this task, based on the Polynomial Spline Estimator and the Sigmoid Function, which is described in the following section.

III. PROPOSED MODEL FOR RECOMMENDING HASHTAGS

Figure 1. System architecture of the proposed model. (Diagram labels: Offline Part, Online Part, Tweets dataset, Topic Model, Hashtag pool, Hashtag topic vectors, Hashtag features, Related tweet number, Development tendency, forthcoming tweet, Tweet topic vector, Content similarity, Hashtag ranking list.)

A. System Architecture of the Proposed Model

In this section, we present the architectural design of the proposed hashtag recommendation model. An overview of the system architecture is shown in Figure 1. It consists of two separate parts, namely the offline part and the online part.

The offline part contains three modules: topic model training, hashtag related tweet number detection and hashtag development tendency detection. In the topic model training module, we train a topic model on a large amount of tweet data in order to convert tweets into topic vectors. In the hashtag related tweet number detection module, we obtain the number of tweets related to a hashtag with the search tool on the microblogging site. In the hashtag development tendency detection module, we infer the development curve of a hashtag and then detect its current development status from the slope of the curve.

The online part contains two modules: hashtag weight calculation and hashtag ranking. In the hashtag weight calculation module, we consider three factors together to evaluate the recommendation value of a hashtag. In the hashtag ranking module, we rank hashtags by their recommendation values, and the top ranked ones are recommended to the authors of the forthcoming tweets. The mechanism of each functional module in the proposed model is discussed in detail in the following sections.

B. Offline Part

1) Topic model training

In supervised dimension reduction (SDR), we are asked to find a low-dimensional space which preserves the predictive information of the response variable [17]. Topic modeling, a family of methods for modeling the implicit topics in a text corpus, is one potential approach to dimension reduction. Generally, a topic model can be used to reduce the semantic dimension of a word vector from thousands of dimensions to dozens of dimensions, and recent advances in this area can deal well with huge data of very high dimension [8].

Common topic models include LDA (Latent Dirichlet Allocation), LSA (Latent Semantic Analysis) and pLSA (probabilistic Latent Semantic Analysis). In this paper, we use the JGibbLDA implementation of the LDA topic model [14] to reduce the dimension of the word vectors of tweets, since the ‘microblog word’ matrix is very sparse and hard to analyze well [2]. We first perform Chinese word segmentation on the tweets with the Chinese analysis tool ICTCLAS [9], and then deduplicate the words to form a word vector. After converting each tweet into a word vector, we use JGibbLDA to discover latent topics and to establish a mapping between words and topics.

The graphical description of LDA is given in Figure 2, in which ϕ represents the distribution of words over topics and θ represents the distribution of topics over microblogs. Besides, α and β are the hyperparameters of the Dirichlet priors. In each iteration, we perform Gibbs sampling given the present ϕ and θ, and then use the sampling result, together with α and β, to calculate new ϕ and θ, repeating until they converge. In this procedure, the initial sampling assumes that words are equally distributed over topics and topics are equally distributed over documents [18].

Figure 2. Graphical representation of LDA model using plate notation. (Diagram labels: α, θ(d), z, w, β, ϕ(z); plates T, Nd and D.)
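The sampling loop described above can be illustrated with a minimal collapsed Gibbs sampler. This is only a sketch under stated assumptions, not the JGibbLDA implementation used in the paper: the function name `lda_gibbs`, the default hyperparameter values, the fixed iteration count (instead of a convergence test) and the random initial topic assignment are all choices made here for illustration. Each document is a list of integer word ids, for example the deduplicated words of one segmented tweet.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (theta, phi): per-document topic distributions and
    per-topic word distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # count of topic k in doc d
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                              # total words in topic k
    assign = []

    # Initial assignment: every word gets a topic drawn uniformly at random
    # (an assumption of this sketch).
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment of this word from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimates of theta (topics per document) and phi (words per topic).
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi
```

In this sketch, each row of `theta` plays the role of the topic vector of one tweet.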

Once the dimension reduction model has been trained, each tweet w in the document collection D can be represented as a mixture of the topics in the topic collection T, which are denoted t1 to tN. In this paper, we use the tweets that contain a hashtag within one week as the semantic text of that hashtag, and we take the normalized sum of the topic vectors of those tweets as the topic vector of the hashtag. After reducing the dimension of tweets with the topic model, we can easily calculate the similarity between tweets, or between tweets and hashtags, from their topic vectors.

2) Hashtag related tweet number detection

For a hashtag, the number of tweets that contain it represents its user acceptance degree in microblogging well [6]. A larger number means a higher user acceptance degree, and vice versa. In this paper, we take this number as one of the key factors for ranking hashtags and for deciding whether to choose a hashtag as a candidate for a forthcoming tweet.

To obtain the number of tweets related to a hashtag, the tweet search function of the microblogging site is needed. When a hashtag h is entered into the search box, the number of tweets containing h is returned on the result page; we denote this number as nh. In this paper, we use log(nh) as the measure of the user acceptance degree of h.

3) Hashtag development tendency detection

The historical development of a hashtag is well reflected by the statistical trend curve of its related tweet numbers [19], and the development tendency of a hashtag at any given time can be detected from the slope of this curve at that time. A greater slope of the trend curve reflects more intense information penetration of the hashtag, and vice versa [7]. Like the user acceptance degree, we take the development tendency as another key factor for ranking hashtags.

To detect the development tendency of a hashtag, we first count the number of its related tweets on each day, which gives a time series. Because downloading all tweets related to a hashtag is difficult, we only count the tweets within seven days as the historical data of the hashtag. Then we use the polynomial spline estimator to estimate the time curve of the hashtag from its historical data [10]. The polynomial spline estimator can be explained as follows: given a tabulated function fi = f(xi), i = 0, ..., I, a spline is a polynomial between each pair of tabulated points, but one whose coefficients are determined “slightly” nonlocally. The non-locality is designed to guarantee global smoothness in the interpolated function up to some order of derivative. Cubic splines are the most popular. They produce an interpolated function that is continuous through to the second derivative. Splines tend to be more stable than fitting a polynomial through the I + 1 points, with less possibility of wild oscillations between the tabulated points.

With the polynomial spline estimator, we can infer a continuous development curve for a hashtag from the discrete data of seven days. Finally, we calculate the slope of the continuous curve of a hashtag h on the given day, which is straightforward given the equation of the inferred curve, and denote it as dh. Besides, we use the Sigmoid Function to normalize the slope value dh. The Sigmoid Function was first introduced by Gabriel Tarde for modeling the adoption rate of a new idea [12], and has since been widely used for modeling various phenomena [13]. It is a bounded differentiable real function that is defined for all real input values and has a positive derivative at each point [13]. The Sigmoid Function is defined as:

$$x_{\text{normalization}} = \frac{1}{1 + e^{-x}} \tag{1}$$

In Equation (1), x is the slope of the fitted curve on the given day. The Sigmoid Function maps a real number on the interval (-∞, ∞) to a real number xnormalization on the interval (0, 1), which we take as dh:

$$d_h = x_{\text{normalization}} \tag{2}$$
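As a rough illustration of the development tendency computation, the sketch below fits a natural cubic spline through the daily tweet counts of one hashtag, takes the slope at the last day and normalizes it with Equation (1). Several points are assumptions made here, since the paper does not fix them: the spline is a natural cubic interpolant (the paper cites the polynomial spline estimator of [10] without giving its exact form), the days are equally spaced one unit apart, and the ‘given day’ is the last day of the series. All function names are hypothetical.

```python
import math


def natural_spline_second_derivatives(counts):
    """Second derivatives of a natural cubic spline at equally spaced knots.

    Assumes unit spacing (one knot per day) and zero second derivative
    at both ends.
    """
    n = len(counts) - 1
    m = [0.0] * (n + 1)
    if n < 2:
        return m
    # Tridiagonal system for the interior knots:
    # m[i-1] + 4*m[i] + m[i+1] = 6*(y[i+1] - 2*y[i] + y[i-1])
    rhs = [6.0 * (counts[i + 1] - 2 * counts[i] + counts[i - 1]) for i in range(1, n)]
    diag = [4.0] * (n - 1)
    # Forward elimination (Thomas algorithm); off-diagonal entries are 1.
    for k in range(1, n - 1):
        w = 1.0 / diag[k - 1]
        diag[k] -= w
        rhs[k] -= w * rhs[k - 1]
    # Back substitution.
    m[n - 1] = rhs[-1] / diag[-1]
    for k in range(n - 3, -1, -1):
        m[k + 1] = (rhs[k] - m[k + 2]) / diag[k]
    return m


def slope_on_last_day(counts):
    """Slope of the fitted spline at the last day of the series."""
    n = len(counts) - 1
    if n < 1:
        raise ValueError("at least two daily counts are needed")
    m = natural_spline_second_derivatives(counts)
    # Derivative at the right end of the last interval, with m[n] = 0.
    return (counts[n] - counts[n - 1]) + m[n - 1] / 6.0


def sigmoid(x):
    """Equation (1), written so that large |x| does not overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def development_tendency(counts):
    """d_h of Equation (2) for one hashtag's daily tweet counts."""
    return sigmoid(slope_on_last_day(counts))
```

For a flat series the slope is 0 and the tendency is 0.5; rising series give values above 0.5 and falling series give values below it. Because raw tweet counts can change by thousands per day, the sigmoid saturates quickly in this sketch; whether the slopes were rescaled before normalization is not stated in the paper.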

C. Online Part

1) Calculating recommendation value of hashtags

For a forthcoming tweet t, we consider three factors to obtain the recommendation value of a hashtag h, which we denote as r(t, h). The three factors have all been mentioned above: the semantic similarity between h and t, the user acceptance degree of h, and the development tendency of h. For each hashtag on each day, the user acceptance degree and the development tendency are fixed, and we can easily obtain these two factors from nh and dh as described in the offline part. For the semantic similarity between h and t, we need to calculate the similarity between their topic vectors. The topic vector of h has been calculated in the offline part, and the topic vector of t is also easy to obtain with the trained topic model. The semantic similarity between two topic vectors Vh and Vt is calculated with the cosine similarity formula below:

$$\cos(V_h, V_t) = \frac{V_h \cdot V_t}{\lVert V_h \rVert \, \lVert V_t \rVert} = \frac{\sum_{i=1}^{N} (w_{hi} \times w_{ti})}{\sqrt{\sum_{i=1}^{N} (w_{hi})^2} \times \sqrt{\sum_{i=1}^{N} (w_{ti})^2}} \tag{3}$$

In Equation (3), cos(Vh, Vt) is the cosine similarity between Vh and Vt, whi is the value of Vh on the ith dimension, and wti is the value of Vt on the ith dimension [11]. To simplify notation, we define the semantic similarity between h and t as s(t, h), i.e.,

$$s(t, h) = \cos(V_h, V_t) \tag{4}$$

Finally, those three factors are integrated together to calculate the recommendation value of hashtags. We define the recommendation value of a hashtag h for a forthcoming tweet t as r (t, h), and the formula of r (t, h) is given below:

$$r(t, h) = \log(n_h) \times d_h \times s(t, h) \tag{5}$$
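Equations (3) to (5) can be sketched in a few lines. The following is an illustrative sketch rather than the authors' code. The function names are hypothetical, the logarithm is assumed to be the natural logarithm (the paper does not state the base), and the ‘normalization’ of the summed tweet topic vectors is assumed to mean scaling the sum so that its entries add up to 1.

```python
import math


def hashtag_topic_vector(tweet_vectors):
    """Normalized sum of the topic vectors of the tweets containing a hashtag.

    Assumption: 'normalized' means dividing by the total so entries sum to 1.
    """
    summed = [sum(column) for column in zip(*tweet_vectors)]
    total = sum(summed)
    return [value / total for value in summed] if total > 0 else summed


def cosine_similarity(v_h, v_t):
    """Equation (3): cosine similarity between two topic vectors."""
    dot = sum(a * b for a, b in zip(v_h, v_t))
    norm_h = math.sqrt(sum(a * a for a in v_h))
    norm_t = math.sqrt(sum(b * b for b in v_t))
    if norm_h == 0 or norm_t == 0:
        return 0.0
    return dot / (norm_h * norm_t)


def recommendation_value(n_h, d_h, v_h, v_t):
    """Equation (5): r(t, h) = log(n_h) * d_h * s(t, h).

    Assumption: hashtags with n_h <= 1 get the value 0, which matches
    log(1) = 0 and avoids log(0) for unseen hashtags.
    """
    if n_h <= 1:
        return 0.0
    return math.log(n_h) * d_h * cosine_similarity(v_h, v_t)
```

Because the three factors are multiplied, a hashtag scores high only when it is content-relevant, widely used and currently trending at the same time.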

2) Ranking and recommending hashtags for given tweets

For online hashtag recommendation, the most important requirement is fast processing. We try to improve the processing speed in two ways:

a) Filtration of less prominent hashtags: To avoid unnecessary time-consuming calculations, we filter out some less prominent hashtags. As described in the offline part, the user acceptance degree of h is measured by log(nh), which is zero when nh is 1, so we only need to calculate the recommendation value of hashtags whose nh is larger than 1. Besides, since we only count the tweets within seven days as the historical data of a hashtag, we can remove from the candidate list the hashtags that did not appear in the most recent seven days. These two steps greatly reduce the number of hashtag candidates.

b) Creation of Lucene index: For the hashtags that remain after the above filtration steps, we create a Lucene index to improve the retrieval speed. The historical data of each hashtag is segmented into words, and all the non-stopwords are taken as the index content. We consider that if a hashtag is related to a forthcoming tweet, at least one non-stopword in the tweet should appear in the historical data of this hashtag. Therefore, we take the Lucene search result as a Boolean condition, and the retrieved hashtags are then ranked by their recommendation values.

Finally, the top k hashtags in the ranking list are recommended to the author of the forthcoming tweet, where k can be set to different values as needed.
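The filtering and ranking steps above can be sketched as follows. This is an illustration under stated assumptions, not the system described in the paper: a plain set intersection stands in for the Lucene index lookup, each hashtag is a dictionary with the hypothetical keys `name`, `n_h` and `recent_words` (the non-stopwords of its tweets from the last seven days), and `score` is any function that returns r(t, h) for a candidate.

```python
def recommend_top_k(tweet_words, hashtags, score, k=20, stopwords=frozenset()):
    """Filter hashtag candidates, then rank them by recommendation value.

    tweet_words: segmented words of the forthcoming tweet.
    hashtags: list of dicts with keys 'name', 'n_h' and 'recent_words'.
    score: function mapping a hashtag dict to its recommendation value.
    """
    query = set(tweet_words) - set(stopwords)
    candidates = []
    for tag in hashtags:
        # Step a): log(n_h) is zero when n_h is 1, so such hashtags are skipped.
        if tag["n_h"] <= 1:
            continue
        # Step a): hashtags with no tweets in the recent seven days are skipped.
        if not tag["recent_words"]:
            continue
        # Step b): Boolean condition, at least one shared non-stopword
        # (a set lookup stands in for the Lucene search here).
        if query.isdisjoint(tag["recent_words"]):
            continue
        candidates.append(tag)
    candidates.sort(key=score, reverse=True)
    return [tag["name"] for tag in candidates[:k]]
```

The cheap filters run before any score is computed, which is the point of the design: the cosine similarity is only evaluated for the hashtags that survive them.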

IV. EXPERIMENTAL RESULTS AND DISCUSSIONS

A. Experimental Data

Sina Weibo is a well-known Chinese microblogging web service offering a fun and interactive way to discover and discuss information, which makes it easy to share personal feelings and opinions with friends online. To examine the performance of our model, we conducted an experimental study on real-world datasets collected from Sina Weibo. Since downloading all the tweets is difficult or even impossible, we chose to build a small dataset with the following steps:

Step 1. We continuously download one day of tweet data from Sina Weibo with the public timeline API (http://open.weibo.com/wiki/2/statuses/public_timeline), and collect all the hashtags used in those tweets as a hashtag list.

Step 2. We track those hashtags with the microblogging search tool and download the tweets that contain them within one week, which are taken as their historical data.

Step 3. 50 tweets that do not use any hashtag are randomly chosen from the tweet data as assumed forthcoming tweets.

Besides, when running the Lucene search engine, we take each tweet as a query and parse the tweet with a new analyzer based on ICTCLAS, so that the analysis results stay consistent with the index file.

B. Methods for Comparison

To evaluate the performance of the proposed algorithm, we compare it with the following baselines. Although we mentioned the models in [3], [4] and [5] as related work in the literature review section, we only take the models in [3] and [5] as baselines, named ‘TSTM’ and ‘TOMOHA’ respectively. This is because our model can be applied to all kinds of tweets, whereas the ‘RankSVM++’ model in [4] can only be used for hyperlinked tweets, which account for only about 19.1% of our experimental data. Besides, to evaluate the effect of the two hashtag related factors, user acceptance degree and development tendency, we design three other baselines:

• SenSim: we take the semantic similarity between hashtags and tweets as the only factor for ranking and recommending hashtags.
• SenSim+Ac: in addition to the semantic similarity, we also consider the user acceptance degree of hashtags as another factor.
• SenSim+Te: in addition to the semantic similarity, we also consider the development tendency of hashtags as another factor.

To highlight the factors used in our model, we name it SenSim+Ac+Te.

C. Gold Standard Generation

Since microblogging users usually care only about the top ranked hashtags, we only evaluate the top 20 hashtag recommendation results of each model for the 50 randomly chosen tweets. A list of these results in random order is then provided to the tagging volunteers.

Three graduate students were recruited to annotate the candidate hashtags. Each annotator gives each hashtag an integer ‘Selection Probability’ from 1 to 5. While scoring, annotators can search for information related to the hashtags with the microblogging hashtag search engine, or simply rely on their own understanding of and interest in those hashtags. We then use the average value of the ‘Selection Probability’ as the final score. Fleiss' Kappa is adopted to verify the degree of agreement among the three annotators; its value is 0.73, indicating substantial agreement.

D. Evaluation Metrics

We use both Normalized Discounted Cumulative Gain at top k (NDCG@k) and Mean Average Precision (MAP) as our evaluation metrics, both of which are suited to ranking problems that are sensitive to the top results. Discounted Cumulative Gain at top k (DCG@k) is defined as:

$$\mathrm{DCG@}k(y) = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log(1 + i)} \tag{6}$$

In Equation (6), y is the candidate hashtag ranking list, and reli is the real recommendation value of the ith result in that list. NDCG@k is then defined as:

$$\mathrm{NDCG@}k(y) = \frac{\mathrm{DCG@}k(y)}{\mathrm{DCG@}k(y^{*})} \tag{7}$$
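Equations (6) and (7) translate directly into code. The sketch below is illustrative only. It assumes that the relevance values are the averaged annotator scores of the recommended hashtags in ranked order, that the ideal ranking is obtained by sorting those same scores in descending order, and that the logarithm is natural (the base cancels in the NDCG ratio). The function names are hypothetical.

```python
import math


def dcg_at_k(rels, k):
    """Equation (6): discounted cumulative gain over the top k results."""
    return sum(
        (2 ** rel - 1) / math.log(1 + i)
        for i, rel in enumerate(rels[:k], start=1)
    )


def ndcg_at_k(rels, k):
    """Equation (7): DCG@k divided by the DCG@k of an ideal ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0
```

A list that is already sorted from the highest to the lowest score gets an NDCG@k of 1, and any ordering that places lower scored hashtags first gets less.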

In Equation (7), y* is a perfect ranking result, i.e., any ordering that is ideal with respect to the manual tagging scores. MAP is defined as the average precision on each day a real keyphrase is detected, as given in Equation (8):

$$\mathrm{MAP} = \frac{1}{N} \sum_{i=1}^{N} \frac{\sum_{j=1}^{i} r(j)}{i} \times r(i) \tag{8}$$
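A minimal sketch of Equation (8) for a single ranking list is given below. It follows the formula as written, where the sum is divided by N, the length of the evaluated list; the function name is hypothetical, and the input is assumed to be the list of binary values r(1), ..., r(N).

```python
def average_precision(r):
    """Equation (8) for one ranking list of binary relevance flags."""
    n = len(r)
    if n == 0:
        return 0.0
    hits = 0
    total = 0.0
    for i, flag in enumerate(r, start=1):
        hits += flag                 # running sum of r(j) for j <= i
        total += (hits / i) * flag   # precision at i, counted only when r(i) = 1
    return total / n
```

For example, the flags [1, 0, 1] give (1/1 + 2/3) / 3, which is about 0.556. Dividing by N rather than by the number of relevant items follows Equation (8) literally and differs from some other common definitions of average precision.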

In Equation (8), i is the position of the hashtag in the ranking list, and we evaluate the top N results. r(i) is a binary value: it is set to 1 when the suggested hashtag at position i is one of the top N ones, and to 0 otherwise. In this paper, we set the k in NDCG@k to 3, 5, 10 and 20 respectively, and the N in MAP is set to 20 in accordance with the way the gold standard was generated. After obtaining the NDCG@k and MAP of the experimental results in each day, we further calculate their average values as the final evaluation metrics.

E. Experimental Results

Figure 3 shows the comparison of the hashtag recommendation results of the 6 models, with NDCG@k (k = {3,5,10,20}) and MAP. SenSim+Ac, SenSim+Te and SenSim+Ac+Te outperform the other 3 models, TSTM, TOMOHA and SenSim. The improvement is due to the integration of the user acceptance degree factor or the development tendency factor of hashtags. In the TSTM, TOMOHA and SenSim models, hashtags that have high semantic similarity with the given tweets but a low user acceptance degree or development tendency will still be recommended, and even hashtags that have appeared only once can be given a very high recommendation value. In the SenSim+Ac, SenSim+Te and SenSim+Ac+Te models, hashtags with a low user acceptance degree or development tendency are removed, since they intuitively contribute very little to the future information diffusion of the given tweets. Besides, the TSTM, TOMOHA and SenSim models achieve similar results on the hashtag recommendation task, which shows that the semantic based methods detect similar hashtags as candidates.

Figure 3. Hashtag recommendation result comparison on the 6 models, with NDCG@k (k = {3,5,10,20}) and MAP.

Next, let us compare SenSim+Ac, SenSim+Te and SenSim+Ac+Te. SenSim+Ac+Te is better than the other two models, which each use only one additional factor besides the semantic similarity value, and this result is expected. It is worth noting that SenSim+Te is better than SenSim+Ac, which shows that the development tendency factor is more effective than the user acceptance degree factor. This suggests that microblogging users may have a very deep impression of hashtags that have appeared frequently in the most recent one or two days, so the annotators give high recommendation values to those hashtags. Besides, SenSim+Ac+Te shows only a minor improvement over SenSim+Te, which again shows that the development tendency factor is very important and reliable for reflecting the recommendation value of hashtags.

V. CONCLUSION AND FUTURE WORK

Microblogging is mainly used to note what is happening around the world and to strengthen communication among users in a social network. By properly adding hashtags to forthcoming tweets, the readability and topicality of those tweets can be greatly improved. In this paper, we have presented a model for recommending hashtags that are both content-relevant and popular to forthcoming tweets, based on multiple factors. Our main contribution is that we are the first to consider the user acceptance degree and the development tendency of hashtags as key factors in the hashtag recommendation task on microblogging platforms, which helps us achieve better hashtag recommendation results. In our future work, we may try to improve the calculation of the semantic similarity between hashtags and tweets, and to identify other effective factors that make the hashtag recommendation results more accurate, such as the tweet authors’ emotional states. Besides, for forthcoming tweets that have no highly content-similar hashtags, we plan to design a hashtag generation model that automatically summarizes the content of a forthcoming tweet and then generates some new appropriate hashtags.

REFERENCES

[1] S. Carter, M. Tsagkias, and W. Weerkamp, “Twitter hashtags: Joint Translation and Clustering,” in Proceedings of the 3rd International Conference on Web Science, 2011, pp. 1-3.
[2] H. Bao, Q. Li, S. S. Liao, S. Song, and H. Gao, “A new temporal and social PMF-based method to predict users' interests in microblogging”, Decision Support Systems 55(3): 698-709 (2013).
[3] Z. Ding, Q. Zhang, and X. Huang, “Automatic Hashtag Recommendation for Microblogs using Topic-Specific Translation Model”, in Proceedings of the 24th International Conference on Computational Linguistics, 2012, pp. 265-274.
[4] S. Sedhai, and A. Sun, “Hashtag Recommendation for Hyperlinked Tweets”, in Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014, pp. 831-834.
[5] J. She, and L. Chen, “TOMOHA: TOpic MOdel-based HAshtag Recommendation on Twitter”, in Proceedings of the 23rd International World Wide Web Conference, 2014, pp. 371-372.
[6] S. Song, Y. Meng, and J. Sun, “Detecting Keyphrases in Microblogging with Graph Modeling of Information Diffusion”, in Proceedings of the 13th Pacific Rim International Conference on Artificial Intelligence, 2014, pp. 26-38.
[7] X. Wang, J. Byrne, L. Kurdgelashvili, and A. Barnett, “High efficiency photovoltaics: on the way to becoming a major electricity source”, Wiley Interdisciplinary Reviews: Energy and Environment, vol. 1, no. 2: 132-151, 2012.
[8] D. Blei, A. Ng, and M. Jordan, “Latent Dirichlet Allocation”, Journal of Machine Learning Research, 3:993-1022, 2003.
[9] H. Zhang, H. Yu, D. Xiong, and Q. Liu, “HHMM-based Chinese Lexical Analyzer ICTCLAS”, in Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, 2003, pp. 184-187.
[10] J. Z. Huang, C. O. Wu, and L. Zhou, “Polynomial spline estimation and inference for varying coefficient models with longitudinal data”, Statistica Sinica 14: 763-788 (2004).
[11] S. Song, Q. Li and H. Bao, “Detecting Dynamic Association among Twitter Topics”, in Proceedings of the 21st International World Wide Web Conference, 2012, pp. 605-606.
[12] G. Tarde, “The laws of imitation”, 1903, New York: Henry, Holt and Co.
[13] H. Ma, W. Qian, F. Xia, X. He, J. Xu, and A. Zhou, “Towards modeling popularity of microblogs”, Frontiers of Computer Science, Volume 7, Issue 2, pp. 171-184.
[14] X-H. Phan, L-M. Nguyen, and S. Horiguchi, “Learning to classify short and sparse text & web with hidden topics from large-scale data collections”, in Proceedings of the 17th International World Wide Web Conference, 2008, pp. 91-100.
[15] M. Last, “Online classification of nonstationary data streams”, Intell. Data Anal. 6(2): 129-147, 2002.
[16] J. M. Kleinberg, “Bursty and hierarchical structure in streams”, in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 91-101.
[17] K. Than, T. B. Ho, D. K. Nguyen, and P. N. Khanh, “Supervised dimension reduction with topic models”, in Proceedings of the 4th Asian Conference on Machine Learning, 2012, pp. 395-410.
[18] N. Zheng, S. Song, and H. Bao, “A Temporal-Topic Model for Friend Recommendations in Chinese Microblogging Systems”, IEEE Trans. on Systems, Man & Cybernetics: Systems 45(*): *-* (2015).
[19] S. Song, Q. Li, and X. Zheng, “Detecting Popular Topics in Microblogging Based on a User Interest-Based Model”, in Proceedings of the 2012 International Joint Conference on Neural Networks, pp. 1-8.