International Conference on Computer & Communication Technology (ICCCT)-2011

Establishing Social Network Communities between Students Based on Their Internet Usage Patterns

Rozita Jamili Oskouei
Computer Science & Engineering Department
Motilal Nehru National Institute of Technology
Allahabad, India

access log files for extracting each individual user's access pattern. Depending on each organization's interests, activities and criteria, it might be important to focus on different aspects of users' usage behavior and community detection rules, and to consider the security or privacy of users' accounts and personal information. Sometimes users' access time or accessible content is limited, especially in business environments. Such limitations might affect the number of users in a community or the number of different communities among users within those organizations.

Abstract- World Wide Web (WWW) resources are used by users around the world for different purposes and with varying requirements. In engineering colleges, the Internet is mainly used for academic purposes by students and professors. During a day, different academic Websites (AC) are visited several times by students with different levels of knowledge and requirements. In this paper, we propose creating a requirement- or usage-based social network (SN) community among these users by analyzing their Internet usage patterns. One of the main advantages of creating this SN community is that it enables knowledge exchange and provides an easy way to connect weak students with other students and professors to solve their problems.

Keywords- Social Networking Community, Websites, Internet Usage Pattern, Usage Mining

Another possible limitation regarding community detection concerns the usage of different categories of visited Websites and the analysis results used to detect communities of users from those usage patterns in academic environments. For example, the most important and attractive category of Websites, and the main focus for academic course coordinators, instructors and administrators, would be academic Websites (AC). This category of Websites is directly or indirectly academic, and the content or structure of its pages relates to books, journals, conferences, programming Websites, dictionaries, e-learning or distance education, etc., as discussed in detail in [1, 4].


I. INTRODUCTION

The World Wide Web (WWW) includes different categories of Websites, which enable each individual user to browse his or her favorite documents. Based on their requirements and needs, users open a URL and start searching for the required documents or information within the contents of that Web page. Internet users belong to different environments and vary in personal and social behavior, age group, knowledge, education, customs, and attitudes. Therefore, their Internet requirements and usage may not be the same. For instance, students and professors of an engineering college might use Internet resources differently during a day depending on where they grew up or live, their age or gender, and their programs or semesters. They can even behave differently in different periods of an academic semester. Several consecutive analyses over two years [1] ~ [3] revealed differential Internet usage patterns and behaviors, with various focus points and effects on various aspects, whereas users in a business organization or office environment have completely different goals for using the Internet. So, administrators in these two environments have to plan their strategies accordingly to better support and serve users' requests.

This paper is organized in six sections. Section 2 presents related work. Section 3 defines and discusses social networks and usage mining. Section 4 presents our method for detecting offline and online user communities. Section 5 presents our experimental results. Section 6 concludes the paper.

II. RELATED WORK

Several research efforts have addressed extracting users' usage behavior on the Internet with different methods, and applying these patterns to predict future accesses [5]~[10]. In [5, 7, 10], the authors used usage mining to extract users' navigational patterns. Extracting users' access patterns from user access sequences (UAS), usually for predicting future accesses and recommending web pages, is discussed in [11 ~ 19]. Social networking usage has been analyzed from different aspects by several authors [20-24]. These authors studied

In this paper, we propose a model for extracting social networking community of users based on their similarity of Internet usage behaviors. This model uses proxy server’s

978-1-4577-1386-6/11/$26.00 ©2011 IEEE


the effects of social networking usage on different activities and performance of students. Web-based communities across the Pacific Islands are studied in [25]. That paper discusses factors such as cultural, social and educational challenges, and aims to address some of the educational and social needs of Pacific Islanders. In [26], the authors focus on developing social software under Web 2.0 based on people's functional diversity.

III.

technology for educational purposes. The authors of [33, 34, 35] classified some other applications of web mining. Web mining includes the following three sub-categories [36, 37]:
- Web Content Mining: is concerned with the extraction of useful knowledge from the content of Web pages with the help of data mining techniques.
- Web Structure Mining: is a new area and is concerned with the application of data mining to the structure of the Web graph.
- Web Usage Mining: aims to discover users' interesting usage patterns by analyzing Web usage data that is stored in proxy server access log files, server log files, etc.

Web usage mining is mostly related to personalization. Web Usage Mining (WUM) consists of four basic stages identified for data mining [38, 39]:
- Data Collection: During this stage, data are collected either from Web servers or from clients that visit a Web site.
- Data Preprocessing: This stage primarily involves data cleaning, user identification and user session identification.
- Pattern Discovery: During this stage, knowledge is extracted by applying machine learning techniques such as clustering, classification, association rule discovery, etc., to the data.
- Knowledge Post-Processing: In this stage, the extracted knowledge is evaluated and presented in a form that is understandable to humans, e.g., by using reports or visualization techniques.

In addition to these techniques, post-processed results can also be incorporated in a Web personalization module.
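The data preprocessing stage described above (data cleaning, user identification and session identification) can be illustrated with a minimal sketch. The log format used here (`epoch_seconds user url`), the 30-minute inactivity timeout, and the names `parse_line` and `sessionize` are all assumptions made for illustration; they are not taken from the proxy server logs used in this work.

```python
from collections import defaultdict

# Assumed inactivity timeout that separates two sessions (a common heuristic,
# not a value given in the paper).
SESSION_GAP = 30 * 60  # seconds


def parse_line(line):
    """Parse a hypothetical 'epoch_seconds user url' record; None if malformed."""
    parts = line.split()
    if len(parts) != 3:
        return None
    ts, user, url = parts
    try:
        return float(ts), user, url
    except ValueError:
        return None


def sessionize(lines):
    """Group each user's requests into sessions split by SESSION_GAP."""
    by_user = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:  # data cleaning: drop malformed records
            ts, user, url = record
            by_user[user].append((ts, url))  # user identification by user name

    sessions = {}
    for user, hits in by_user.items():
        hits.sort()  # order requests by timestamp
        user_sessions = [[hits[0][1]]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > SESSION_GAP:
                user_sessions.append([])  # long pause: start a new session
            user_sessions[-1].append(url)
        sessions[user] = user_sessions
    return sessions


LOG = [
    "1000 u1 http://java.com/",
    "1100 u1 http://ieee.org/",
    "9000 u1 http://java.com/",
    "garbage line",
    "1200 u2 http://ieee.org/",
]
SESSIONS = sessionize(LOG)
```

In this toy log, the third request of `u1` arrives long after the second, so it opens a second session, and the malformed line is discarded during cleaning.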

SOCIAL NETWORKING & WEB USAGE MINING

A. Social networking definition

By 2003, the Web had become an active space of socialization for the majority of users [27]. A social network is defined as a social structure of individuals who are directly or indirectly related to each other based on a common relation of interest such as friendship, trust, etc. [28]. Social networking [29] is the grouping of individuals into specific groups, like small rural communities or a neighborhood subdivision. The Internet is filled with millions of individuals who are looking to meet other people to gather and share firsthand information and experiences about cooking, golfing, gardening, developing friendships or professional alliances, finding employment, business-to-business marketing and even groups sharing information. Social networking data can be viewed as a social relational system characterized by a set of actors and their social ties. Additional information in the form of actor attribute variables or multiple relations can be part of the social relational system [30].

B. Different Types of Social Networks

According to [31], three types of social networks are defined. Table 1 shows the basic properties of the different kinds of social networks.

IV.

OUR METHODS FOR DETECTING SOCIAL NETWORK COMMUNITIES

Table 1: Comparing Basic Properties of Different Kinds of Social Networks

Social Network Type         | Actors                                | Ties                                             | Direction
Friendship Network          | People in Society                     | Friendship relations between people              | Undirected
Web's Social Network        | Web pages                             | Links Between Web Pages                          | Directed
Semantic Web Social Network | Semantic Web Docs or Concepts in Them | Semantic Relations Between Documents or Concepts | Directed or Undirected

One of the main problems in educational environments is finding useful resources on the Internet that help college students solve course-related problems or learn difficult tasks. Unfortunately, it is not easy to find exactly the right person who can help or who is interested in helping. This paper attempts to solve this problem by developing an online community. Our proposed modeling scheme has the following steps:
1- Extract users' usage patterns from server access log files
2- Discover each user's Internet usage pattern in terms of:
   a. Average total time spent on the Internet per day
   b. URL of each visited Website
   c. The category of that Website, based on our specified Website categorization scheme [1]. Based on this

C. Web Usage Mining

Web mining is valued because it uses data mining techniques to automatically discover and extract information from web documents and services. In [32], Tanimoto outlined various suggestions for using this


   a. Detect all similar graphs, using different methods for detecting similar graphs
   b. Make different classifications based on the similarity of graphs
   c. Analyze each group's usage of the different sub-categories in the Website classification scheme
   d. Identify the similarity of usage and all users with a similar Internet usage pattern per day
5- Focus on a specific sub-category of Websites in detail
   a. Identify all users that visited those Websites or sub-categories
   b. Make a database recording those users' information, including their user name and email id
6- Finally, start to create an online community or social network among the users
   a. Send an invitation to all users, attaching other users' email ids from the database
   b. If a user is interested in communicating, he or she can send an email to another user and start discussing any related solutions of interest for solving problems

With this approach, two individual users may receive several invitations from each other, based on the similarity of their visited Websites.
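Step 4 leaves the method for detecting similar usage graphs open. As one possible illustration, the sketch below flattens each user's graph into a set of (category, sub-category) pairs and compares users with the Jaccard coefficient. The choice of Jaccard similarity, the 0.5 threshold, the sample data, and the names `jaccard` and `similar_pairs` are assumptions for this example, not the method used in the paper.

```python
def jaccard(a, b):
    """Jaccard coefficient of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0  # two empty usage patterns are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical usage patterns: the (category, sub-category) pairs each user visited.
USAGE = {
    "u1": {("AC", "Software"), ("AC", "EBooks")},
    "u2": {("AC", "Software"), ("AC", "EBooks"), ("NAC", "Social Networking")},
    "u3": {("NAC", "Entertainments")},
}


def similar_pairs(usage, threshold=0.5):
    """Return all user pairs whose usage similarity reaches the threshold."""
    users = sorted(usage)
    return [
        (a, b)
        for i, a in enumerate(users)
        for b in users[i + 1:]
        if jaccard(usage[a], usage[b]) >= threshold
    ]


PAIRS = similar_pairs(USAGE)
```

Here `u1` and `u2` share two of three distinct sub-categories, so they form the only pair at or above the threshold; `u3` has no overlap with either.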

scheme, all visited Websites are categorized under two categories based on the content and structure of those Websites. The two main categories are:
      i. Academic Websites, including:
         - EBooks
         - Software
         - Journals and Conference
         - All Universities' Websites
         - etc.
      ii. Non-Academic Websites, including:
         - Social Networking
         - Entertainments
         - Business
         - etc.
   d. Map each user's usage pattern onto the Website categorization scheme
   e. Create a graph for each user based on his/her Internet usage pattern, as shown in Figure 1.
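Steps 2c to 2e (categorize each visited Website, map the usage onto the scheme, and build a per-user graph) can be sketched as a small tree builder. The host-to-category table below is a hypothetical stand-in for the categorization scheme of [1], and `usage_tree` is an illustrative name; unknown hosts are labeled "Unknown" purely as an assumption of this example.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical host -> (category, sub-category) table; the actual scheme is in [1].
CATEGORY_OF = {
    "java.com": ("AC", "Software"),
    "ieee.org": ("AC", "Journals and Conference"),
    "facebook.com": ("NAC", "Social Networking"),
}


def usage_tree(urls):
    """Build a category -> sub-category -> hosts tree for one user's visited URLs."""
    tree = defaultdict(lambda: defaultdict(set))
    for url in urls:
        host = urlparse(url).hostname or ""
        if host.startswith("www."):
            host = host[4:]  # treat www.java.com and java.com as the same site
        category, sub_category = CATEGORY_OF.get(host, ("Unknown", "Unknown"))
        tree[category][sub_category].add(host)
    # Convert to plain dicts with sorted host lists for stable output.
    return {c: {s: sorted(h) for s, h in subs.items()} for c, subs in tree.items()}


TREE = usage_tree([
    "http://www.java.com/download",
    "http://ieee.org/",
    "http://facebook.com/x",
])
```

The resulting nested dictionary plays the role of the hierarchical graph: the top level separates AC from NAC, and the second level holds the sub-categories with the sites visited under each.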

Figure 2 shows two communities of students based on the similarity of their Website usage.

Figure 1: Modeling Each User's Usage Pattern

Figure 2: Establishing Social Network Community based on Internet Usage Pattern

In Figure 1, the usage patterns of two users are presented, based on the classification scheme presented in [1, 4]. In this figure, both users visit the Academic Websites (AC) and Non-Academic Websites (NAC) categories. At the sub-category level, however, they do not behave similarly.
3- Select focus point
   a. In academic environments, we propose AC as the main analysis and community creation point.
4- Explore users' usage pattern hierarchical graph

As shown in Figure 2, different users have similar usage of particular Websites such as Java.com and ieee.org, so exchanging academic knowledge or data among these users would be desirable. It would help beginners or freshers whenever a facility is available that introduces them to other students or professors who visit the same academic Websites; they can then easily form a social network group with seniors or professors to solve problems or discuss new solutions.
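The grouping in Figure 2 can be illustrated by inverting a user-to-sites mapping into one community per focus Website. The student ids and visit sets below are made up for the example, and `communities` is a hypothetical helper name; a user who visits both sites naturally lands in both communities.

```python
from collections import defaultdict

# Hypothetical visit data: which focus sites each student visited.
VISITS = {
    "s1": {"java.com"},
    "s2": {"java.com", "ieee.org"},
    "s3": {"ieee.org"},
    "s4": {"java.com", "ieee.org"},
}


def communities(visits, focus_sites):
    """Map each focus site to the set of users who visited it."""
    groups = defaultdict(set)
    for user, sites in visits.items():
        for site in sites & focus_sites:  # ignore sites outside the focus point
            groups[site].add(user)
    return dict(groups)


COMMUNITIES = communities(VISITS, {"java.com", "ieee.org"})
# Users who belong to both communities.
SHARED = COMMUNITIES["java.com"] & COMMUNITIES["ieee.org"]
```

Set intersection between two communities directly yields the users who would appear in both groups.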


Figure 2 also shows that an individual user can belong to more than one community or academic social network group. In Figure 2, the two users connected by red lines belong to both communities because both java.com and ieee.org are in their usage patterns.

V.

OUR EXPERIMENTAL RESULTS

We collected access log files over a period of 30 months, covering all normal days, mid-term examination weeks and final examination weeks of 5 semesters. Different analysis results are shown in [1] ~ [4]. A new approach for detecting users' temporal and periodic usage patterns is presented in [40]. This paper proposes a method for detecting users' similar usage behavior, which uses a matrix to show which Websites are used by which users:

$$
W(i,j) =
\begin{cases}
1 & \text{if user } U_i \text{ visits Website } W_j \\
0 & \text{if user } U_i \text{ does not visit Website } W_j
\end{cases}
$$

So the following matrix, for Web-Category = C, displays all users with similar usage patterns:

$$
\begin{array}{c|ccccc}
 & U_1 & U_2 & U_3 & U_4 & U_5 \\ \hline
W_{01} & 1 & 1 & 0 & 0 & 0 \\
W_{02} & 0 & 1 & 1 & 1 & 1 \\
W_{03} & 1 & 0 & 0 & 1 & 0 \\
W_{04} & 0 & 1 & 1 & 0 & 0 \\
\ldots & 1 & 0 & 0 & 0 & 0 \\
\ldots & 1 & 1 & 1 & 1 & 1 \\
W_{n} & 1 & 0 & 0 & 1 & 1
\end{array}
$$

an invitation email to all of those users, asking whether they would like to join other students who had already visited this Website, and to inform us by replying to the email. In this invitation email, we noted the other connected users' branch and semester, without their user names or email ids. In the next step, after the interested students replied, we selected the user names of all users who accepted the invitation to connect with other users, or with specific users of a particular branch and semester. For security purposes, we do not send any individual user's user name or email id to other users until the final step. Only after the invitation is accepted is the first communication between those users provided, under our control; after the first communication, they are able to exchange email ids, phone numbers, etc. with each other and manage these communications themselves.
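The privacy rule described above (an invitation reveals only branch and semester, and contact details are released only after acceptance) can be sketched as follows. The `Student` record, the message wording, the sample data, and the function names are hypothetical and serve only to illustrate the rule; they do not reproduce the system used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    user_name: str
    email: str
    branch: str
    semester: int


def invitation_body(recipient, peers, site):
    """Compose an invitation listing peers' branch and semester only."""
    lines = [f"Other students also visited {site}:"]
    for peer in peers:
        if peer != recipient:
            # Deliberately omit user_name and email at this stage.
            lines.append(f"- branch {peer.branch}, semester {peer.semester}")
    lines.append("Reply to this email if you would like to join them.")
    return "\n".join(lines)


def contacts_to_share(students, accepted_user_names):
    """Release email ids only for students who accepted the invitation."""
    return [s.email for s in students if s.user_name in accepted_user_names]


A = Student("alice01", "alice@example.edu", "CSE", 3)
B = Student("bob02", "bob@example.edu", "ECE", 5)
BODY = invitation_body(A, [A, B], "java.com")
SHARED_CONTACTS = contacts_to_share([A, B], {"bob02"})
```

Keeping identity fields out of `invitation_body` entirely, rather than filtering them later, makes it harder to leak a user name or email id before a student has accepted.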

More attention is needed regarding community creation based on AC category usage and the students involved. In particular, students using the eBook and programming sub-categories were interested in participating in our proposed communities, and they accepted most of the invitations based on the similarity of their programming Website and eBook usage.

VI. CONCLUSION

In this paper, we discussed establishing a Website-usage-based social network community among users in an academic organization. The primary focus was to detect communities based on users' visits to academic-related categories of Websites. We used an invitation-sending method to address privacy and security issues. Communities can be extracted from offline or online access log files. In the offline form, access log files are used and invitations to join the network are sent at fixed intervals; for example, an invitation is sent every 12 hours. In the online method, whenever a user opens a Website, his or her information is saved in our server's database and an invitation is sent immediately to all visitors of that Website. Our future work will consider establishing dynamic social networks among students of different colleges and other environments.

In this matrix, rows are the names of Websites belonging to one category or sub-category (for example, the AC category, or the SN sub-category of the NAC category), and columns are users' user names. If Website Wj is visited by user Ui, the entry is marked as 1; if not, it is marked as 0. By extracting all 1s from each row, we are able to detect a community of users, and the sum of the 1s in each row shows the traffic for visiting that Website. Our results showed a huge amount of traffic on SN Websites, especially among female users, whereas traffic on Entertainment (EN) category Websites was high among male students, especially during examination periods. A small number of visits to undesirable Websites was detected for both female and male students. We used these results to analyze the effects of this Website usage on those students' academic performance.
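The two row operations described above (collecting the 1s in a row to obtain a community, and summing a row to obtain that Website's traffic) can be sketched on a small 0/1 matrix. The values below are illustrative toy data, and `community` and `traffic` are hypothetical helper names.

```python
# Column order of the matrix: one column per user.
USERS = ["U1", "U2", "U3", "U4", "U5"]

# Illustrative visit matrix: one row per Website, 1 = visited, 0 = not visited.
VISIT = {
    "W01": [1, 1, 0, 0, 0],
    "W02": [0, 1, 1, 1, 1],
    "W03": [1, 0, 0, 1, 0],
    "W04": [0, 1, 1, 0, 0],
}


def community(site):
    """Users whose entry in the site's row is 1."""
    return [user for user, flag in zip(USERS, VISIT[site]) if flag == 1]


def traffic(site):
    """Number of 1s in the site's row, i.e. how many users visited it."""
    return sum(VISIT[site])


W02_COMMUNITY = community("W02")
W02_TRAFFIC = traffic("W02")
```

Note that with a 0/1 matrix the row sum counts distinct visitors rather than total page requests, so it is a coarse measure of traffic.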

Acknowledgment

I am thankful to my professor B.D.Chaudhary for helping me in preparing this paper. I also acknowledge the help rendered to me by the staff of the computer center and the Dean (Academic Affairs) office.

In this research, we selected all student users from different branches who connected to the Internet during one week and visited java.com.
- We created a database recording the user names of all users whose access log files contained a record of visiting this Website, along with their email ids. For security purposes, we sent


REFERENCES

[1] R. Jamili Oskouei and B.D.Chaudhary, "Internet Usage Pattern by Female Students: A Case Study", ITNG 2010, IEEE 7th International Conference on Information Technology, USA, pp.1247-1250
[2] R. Jamili Oskouei, "Behavior Mining of Female Students by Analyzing Log Files", IEEE 2010 Fifth International Conference on Digital Information Management (ICDIM), pp.5-10
[3] R. Jamili Oskouei, "Differential Internet Behavior's of Students From Gender Groups", International Journal of Computer Applications, Vol.4, No.7, 2010, pp.36-42
[4] R. Jamili Oskouei, "Impact of Non-Academic Websites Usage on Female Students Academic Performances (A Case Study)", (ICENT) 2010, IEEE, International Conference on Education and Network Technology, China, pp.29-33
[5] J.Srivastava, R.Cooley, Mukund Deshpande, P-Ning Tan, "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data", SIGKDD, Jan 2000, Vol.1, Issue 2, pp.12-23
[6] F.Khalil, J.Li, H.Wang, "An Integrated Model for Next Page Access Prediction", International Journal knowledge and Web Intelligence (IJKWI), Vol.1, Nos.1/2, 2009
[7] F.Masseglia, P.Poncelet, M.Teisseire, "Web Usage mining: Extracting Unexpected Periods from Web Logs", Springer Science+Business Media, LLC 2007
[8] E.Adar, D.S.Weld, B.N.Bershad, S.D.Gribble, "Why We Search: Visualizing and Predicting User Behavior", WWW 2007 / Track: Data Mining
[9] I.Yuan Lin, X.M.Huang, M.S.Chen, "Capturing User Access Patterns in the Web for Data Mining", http://www.computer.org/portal/web/csdl/doi/10.1109/TAI.1999.809818
[10] Y.Dong, H.Zhang, L.Jiao, "Research on Application of User Navigation Pattern Mining Recommendation", Proceedings of the 6th World Congress on Intelligent Control and Automation, June 21-23, 2006, Dalian, China
[11] Li Xue, Ming Chen, Yun Xiong, Yangyong Zhu, "User Navigation Behavior Mining Using Multiple Data Domain Description", 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
[12] Michiaki Iwazume, Kengo Shirakami, Kazuaki Hatsdani, H.Takeda, T.Nishida, "IICA: An Ontology-based Internet Navigation System", AAAI Technical Report WS-96-06 (www.aaai.org)
[13] Bettina Berendt, "Web Usage mining, Site Semantics, and the support of navigation", http://robotics.stanford.edu/ronnyk/WEBKDD2000/papers/berendt.pdf
[14] E.Marques, A.C.Garcia, I.Ferraz, "RED: A Model To Analyze Web Navigation Patterns", www.research.ibm.com/iuiworkshop/.../marquesred cadui.pdf
[15] Y.Dong, H.Zhang, L.Jiao, "Research on Application of User Navigation Pattern Mining Recommendation", Proceedings of the 6th World Congress on Intelligent Control and Automation, June 21-23, 2006, Dalian, China
[16] Leszek Borzemski, "Internet Path Behavior Prediction via Data Mining: Conceptual Framework and Case Study", Journal of Universal Computer Science, vol.13, no.2 (2007), 287-316
[17] I.Yuan Lin, X.M.Huang, M.S.Chen, "Capturing User Access Patterns in the Web for Data Mining", http://www.computer.org/portal/web/csdl/doi/10.1109/TAI.1999.809818
[18] M.Jalali, N.Mustapha, A.Mamat, Md.Nasir B.Sulaiman, "A New Clustering Approach based on Graph Partitioning for Navigation Patterns Mining", IEEE 2008
[19] H.Ling, Y.Liu, S.Yang, "An Ant Colony Model for Dynamic Mining of Users Interest Navigation Patterns", IEEE International Conferences on Control and Automation, 2007
[20] Boyd, D.M & Ellison, N.B (2007), "Social Sites: Definition, History and Scholarships", Journal of Computer Mediated Communication, article 11, http://jcmc.indiana.edu/vol13/issue1/boyd.ellison.html
[21] Dr. Jaideep Srivastava, "Data Mining for Social Network Analysis", IEEE ISI 2008 Invited Talk (III)
[22] Haria Liccardi, Asma Ounnas, Reena Pau, Elizabeth Massey, Pavi Kinnunen, Sarah Lewthwaite, Marie-Anne Midy, Chandan Sarkar, "The Role of Social Networks in Students' Learning Experiences", http://portal.acm.org/citation.cfm?id=1345375.134544
[23] K.Figl, S.Kabicher, K.Toi, "Promoting Social Networks among Computer Science Students", 38th ASEE/IEEE Frontiers in Education Conference, October 22-25, 2008, Saratoga Springs, NY
[24] R. Jamili Oskouei, "Analyzing Different Aspects of Social Network Usages on Students Behaviors and Academic Performance", 2010 IEEE, Technology for Education (T4E), Bombay, pp.216-221
[25] M.Lding, J.Skouge, J.Peter, "The Computer and the Canoe: Web-based Communities across the Pacific Islands", Int. J. Web Based Communities, Vol.4, No.1, 2008
[26] O.C.Santos and J.G.Boticario, "Requirements for building accessible web-based communities for people with functional diversity", Int. J. Web Based Communities, Vol.4, No.1, 2008
[27] P.Mika, "Flink: Semantic Web Technology for Extraction and Analysis of Social Networks", Journal of Web Semantics, Vol.3, pp.211-223, October 2005
[28] Dr. Jaideep Srivastava, "Data Mining for Social Network Analysis", IEEE ISI 2008 Invited Talk (III)
[29] http://www.whatissocialnetworking.com/
[30] S.Wasserman, K.Faust, "Social Network Analysis: Methods and Applications", Cambridge University Press, November 1994
[31] M.Jamali and H.Abolhassani, "Different Aspects of Social Network Analysis", IEEE/WIC/ACM International Conference on Web Intelligence (WI'06)
[32] Steven L.Tanimoto, "Improving the Prospects for Educational Data Mining", http://www.educationaldatamining.org/UM2007/Tanimoto.pdf
[33] Boyd, D.M & Ellison, N.B (2007), "Social Sites: Definition, History and Scholarships", Journal of Computer Mediated Communication, article 11, http://jcmc.indiana.edu/vol13/issue1/boyd.ellison.html
[34] A.Koohang, "Students Perceptions toward the use of the digital library in weekly web-based distance learning assignment portion of a hybrid programme", British Journal of Educational Technology, Vol 35, pp.617-626, 2004
[35] on the World Wide Web, In Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI97) Technology, Vol 35, pp.617-626, 2004
[36] R. Cooley, B. Mobasher, and J. Srivastava, "Web Mining: Information and Pattern Discovery on the World Wide Web", In Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI97)
[37] Kosala, R. & Blockeel, H. (2000), "Web Mining Research: A Survey", SIGKDD Explorations
[38] Buchnner, A.G, Mulvenna, M.D., Anand, S.S, & Hughes, J.G (1999), "An internet-enabled knowledge discovery process", In Proceedings of 9th International Database Conference, Hong Kong, pp.13-27
[39] Dimitrios Pierrakos, G.Paliouras, C.Papatheodorou, C.D.Spyropoulos, "KOINOTITES: A Web Usage Mining Tool for Personalization", http://www.ionio.gr/ papatheodor/papers/F63.pdf
[40] R. Jamili Oskouei, "A Novel Approach for Extracting Students Temporal and Periodic Internet Usage Behavior", International Journal of Computer Applications (0975 – 8887), Volume 4, No.6, July 2010
