Learning Styles Vs Suitable Courses

Lakshmi Sreenivasa Reddy. D
Department of CSE, Rise Gandhi Groups of Institutions, Ongole, India

M.Rao Batchanaboyina
Department of CSE, PACE Institute of Tech & Science, Ongole, India

D.V.V.S.Phanikumar
Department of CSE, Rise Prakasam Groups of Institutions, Ongole, India

Dr B.Ravindrababu
Department of CSE, VNR VJIET, Hyderabad, India

Abstract— Most students are forced to take courses based on their parents' interests rather than their own, and some students select courses without knowing their inner ability. This paper derives how students should select courses at different levels based on their learning styles. This is achieved by eliminating the outliers in data collected from students. Since the data collected on learning styles is categorical, outlier detection for categorical data is used to eliminate outliers from it. These outliers arise during data collection: some students are very peculiar, some are not interested in revealing their data, some give biased wrong answers to a questionnaire, and some give incomplete data due to lack of time. For the experiments, the data was collected from B.Tech students of different colleges. After the outliers are eliminated by the proposed outlier techniques, different classifiers are applied to frame a set of rules for selecting suitable courses based on learning styles. The results are better when the proposed method is applied.

Keywords— ILS questionnaire, Frequency conditional probability, Felder and Silverman

978-1-4799-1626-9/13/$31.00 ©2013 IEEE

I. INTRODUCTION

This paper is organized into five parts. Part one discusses the literature and the current status of the work. Part two introduces definitions of outliers and explains how outliers occur in education data. Part three explains the approach to finding outliers in different types of data; the data used here is categorical, and the aim is an outlier model for supervised categorical data. Part four presents the experimental results. Part five gives the conclusion and directions for future work.

In this paper, the "Index of Learning Styles" designed by Felder & Silverman [1] is used to collect data from B.Tech students. The questionnaire contains 44 questions related to different learning styles, together with a class label attribute recording whether or not a student succeeded in B.Tech. All these questions (attributes) have two options. Each learning style dimension contains 11 questions. From each set of 11 questions, the strengths of two learning styles can be identified, so from the 44 questions the strengths of 8 learning styles are identified. In [2], numerical models such as k-NN and density-based, distance-based, and cluster-based methods are used to identify learners' learning styles.

To select a suitable course, it is necessary to build a classification model of learners' learning behaviors collected from the ILS questionnaire, which was developed by Felder and Silverman and consists of 44 questions. The questionnaire is divided into four dimensions: active/reflective, global/sequential, sensing/intuitive, and visual/verbal. Each dimension consists of 11 questions. Along with this questionnaire, some personalized questions are included that consider factors related to the student's education environment. Many factors reduce the reliability of a model that predicts whether a student succeeds. Preprocessing is needed to remove these factors and to extract meaningful information from the collected learners' data. The proposed outlier analysis method is one such preprocessing method for education data.

II. OUTLIERS ARE DIFFERENT IN EDUCATION DATA

Outliers are records that do not conform with the other objects when a model is built and patterns are generated from the data, because they behave extraordinarily compared with other data objects. They differ significantly from the other objects in the database, which is why they are called outliers. Outliers lead to wrong decisions. They commonly arise from factors such as measurement errors and data entry mistakes, and such outliers are easily found by existing distance-based, density-based, or statistical methods. In education data, however, the outliers are different: they arise from psychological factors of the students. Consider a system that collects learners' learning information from their problem-solving process and creates a learning model based on the rate of incorrect answers and the time required per problem. Sometimes students want to finish the exam early, and then the expected data cannot be obtained. If this data is considered, the model is degraded. Such data points need to be treated as outliers.

III. LEARNING STYLES

To make the approach applicable, knowledge of the different learning styles is essential.

A. Active/Reflective learning styles

Based on learner data, active learners are characterized as learners who prefer to process information actively, spending more time discussing, explaining, or testing the information with others. Reflective learners do the same work alone and prefer to think about the material before processing it. A student's preference for active or reflective learning is indicated by the use of communication tools such as discussion forums: active learners discuss, clarify doubts, and explain more than is expected, whereas reflective learners read frequently and carefully and prefer to participate passively. Active learners are expected to take self-assessment tests and to spend more time overall on exercises, because they prefer testing and trying things out. Since they prefer doing things themselves, they are expected to spend only limited time on examples, which show how someone else solved a problem. Reflective learners are expected to spend more time on reading material such as content objects and to stay longer on outlines. They reflect on their results, both in self-assessment tests and on the result pages of self-assessments and exercises. Reflective learners then expect to answer the same question when it reappears in a self-assessment test.

B. Sensing/Intuitive learning styles

Sensing learners prefer concrete material. This dimension is analyzed through performance on questions about facts and on questions about theories and their underlying meaning; the latter, abstract material is preferred by intuitive learners. Sensing learners prefer to learn concrete material from examples. Intuitive learners learn from the content objects and use examples only as supplementary material, so for them the time spent on content objects tends to be high and the time spent on examples tends to be low. Sensing learners show a higher interest in examples, existing approaches, and self-assessment tests and exercises for checking the acquired knowledge. Sensing learners solve problems by following examples, whereas intuitive learners tend to be more creative and prefer challenges. Intuitive learners are therefore expected to give better answers to questions that require developing new solutions, which demands an understanding of the underlying theories and concepts. Sensing learners work carefully and slowly, with attention to detail, so a long time spent on self-assessment tests is considered a pattern for this style. Sensing learners are also expected to spend more time reviewing their results. Performance on questions about details can therefore indicate this careful attention to detail.

C. Visual/Verbal learning styles

Visual learners prefer to learn from graphics, images, and flow charts, whereas verbal learners prefer to learn from words. Performance on questions about graphics as well as on questions about text can therefore act as a pattern. Furthermore, verbal learners like communicating and discussing with others. Thus a verbal learning style is indicated by a higher number of visits and postings, as well as a large amount of time spent, in a discussion forum. Verbal learners also visit reading material such as content objects more often.

D. Sequential/Global learning styles

Sequential learners are more comfortable with details, whereas global learners prefer to see the "big picture" and connections to other fields. Performance on questions about the overview of concepts, on questions about details, and on questions about the connections between concepts serves as a pattern for this dimension. Because global learners seek the "big picture", the outline of the course and of its chapters is especially important to them. Global learners are expected to interpret predefined solutions and to develop new solutions, both of which require connecting topics to each other, so they perform better on the respective questions. How a learner navigates a course also acts as a pattern: sequential learners go through the course step by step in a linear way, whereas global learners tend to learn in large leaps, skipping learning objects and jumping to more complex material.

IV. PROPOSED APPROACH

A. Collection of raw data

The data were collected from engineering students of different branches at four colleges, using the Moodle tool with the ILS questionnaire. The branches ECE, CSE, CIVIL, and EEE are considered, and about 500 records were collected. Along with the forty-four ILS questions, the proposed questionnaire includes 10 questions on personal information that influences the learning environment.

B. Learning Style Matrix (LSM)

The collected data are converted into an LSM, which consists only of categorical data with nine dimensions: "act/ref", "sen/int", "vis/verb", "seq/glob", "medium", "INT.MAR", "INT.Math.Mar", "Branch", and "B.Tech Result". If a student scored more than 65% in aggregate marks up to the first semester of the fourth year, the "B.Tech Result" attribute (the class label) is "Success"; otherwise it is "Not success". The medium attribute takes two values, Telugu medium (T) and English medium (E), and refers to the medium of instruction up to intermediate standard. The INT.MAR attribute takes two values, Success (S) (>80%) and Not success (NS) (<80%). The INT.Math.Mar attribute likewise takes two values, Success (>85%) and Not success (<85%).
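The conversion of one student's raw answers into an LSM record can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the layout of the standard ILS scoring sheet, in which questions 1, 5, 9, ... belong to active/reflective, questions 2, 6, 10, ... to sensing/intuitive, and so on, with answer "a" counting toward the first pole of each dimension. The function names `ils_styles` and `lsm_record` and the "S"/"NS" codes for INT.Math.Mar are hypothetical; the 80%, 85%, and 65% thresholds are those used in the rule set and the class label definition.

```python
from typing import Dict, List

# Assumption: standard ILS scoring-sheet layout; answer "a" counts toward
# the first pole (act, sen, vis, seq), answer "b" toward the second.
DIMENSIONS = [
    ("act/ref", "act", "ref"),
    ("sen/int", "sen", "int"),
    ("vis/verb", "vis", "verb"),
    ("seq/glob", "seq", "glob"),
]


def ils_styles(answers: List[str]) -> Dict[str, str]:
    """Map 44 'a'/'b' answers to one categorical pole per dimension."""
    if len(answers) != 44 or any(a not in ("a", "b") for a in answers):
        raise ValueError("expected 44 answers, each 'a' or 'b'")
    styles = {}
    for offset, (name, pole_a, pole_b) in enumerate(DIMENSIONS):
        group = answers[offset::4]      # the 11 questions of this dimension
        a_count = group.count("a")
        # 11 is odd, so a tie between the two poles is impossible.
        styles[name] = pole_a if a_count > len(group) - a_count else pole_b
    return styles


def lsm_record(answers: List[str], medium: str, inter_pct: float,
               inter_math_pct: float, branch: str,
               btech_pct: float) -> Dict[str, str]:
    """Build one nine-attribute categorical LSM record (hypothetical helper)."""
    record = ils_styles(answers)
    record["medium"] = medium                                  # "T" or "E"
    record["INT.MAR"] = "S" if inter_pct > 80 else "NS"
    record["INT.Math.Mar"] = "S" if inter_math_pct > 85 else "NS"
    record["Branch"] = branch
    record["B.Tech Result"] = "Success" if btech_pct > 65 else "Not success"
    return record
```

Keeping every attribute as a short category label matches the requirement that the LSM contain only categorical data, so the same record can be passed unchanged to a frequency-based outlier score.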
Rule 2: If (Student got more than 85% in Intermediate Mathematics) ⇒ (Success Rate of B.Tech = 74.903%)
Rule 3: If (Student got more than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = Civil, ECE, EEE, IT, MCA) ⇒ (Success Rate of B.Tech = 77.523%)
Rule 4: If (Student got more than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE) ⇒ (Success Rate of B.Tech = 60.976%)
Rule 5: If (Student got more than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE) ∧ (Medium = E) ⇒ (Success Rate of B.Tech = 63.889%)
Rule 6: If (Student got more than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE) ∧ (Medium = T) ⇒ (Success Rate of B.Tech = 40.000%)
Rule 7: If (Student got less than 85% in Intermediate Mathematics) ⇒ (Success Rate of B.Tech = 34.359%)
Rule 8: If (Student got less than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = Civil) ⇒ (Success Rate of B.Tech = 72.500%)
Rule 9: If (Student got less than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE, ECE, EEE, IT) ⇒ (Success Rate of B.Tech = 24.515%)
Rule 10: If (Student got less than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE, ECE, EEE, IT) ∧ (Student got > 80% in Intermediate) ⇒ (Success Rate of B.Tech = 40.000%)
Rule 11: If (Student got less than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE, ECE, EEE, IT) ∧ (Student got < 80% in Intermediate) ⇒ (Success Rate of B.Tech = 14.737%)
Rule 12: If (Student got less than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE, ECE, EEE, IT) ∧ (Student got > 80% in Intermediate) ∧ (Student Learning Style = Verbal) ⇒ (Success Rate of B.Tech = 66.667%)
Rule 13: If (Student got less than 85% in Intermediate Mathematics) ∧ (Selects B.Tech branch = CSE, ECE, EEE, IT) ∧ (Student got > 80% in Intermediate) ∧ (Student Learning Style = Sensitive) ⇒ (Success Rate of B.Tech = 33.333%)

Fig. 2. Rule set generated by CRT for K=40 through AVF
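Rules 2 to 13 nest in the way the branches of a decision tree do, so they can be read as a lookup that returns the success rate of the most specific rule matching a student. The sketch below is only an illustration of that reading. The function name `success_rate` and its parameters are hypothetical, and the handling of the boundary values (exactly 85% or 80%, which the rules leave open) is an assumption: they are placed in the "less than" branch.

```python
from typing import Optional


def success_rate(math_pct: float, branch: Optional[str] = None,
                 medium: Optional[str] = None,
                 inter_pct: Optional[float] = None,
                 style: Optional[str] = None) -> float:
    """Return the B.Tech success rate (%) of the most specific matching rule.

    Unknown attributes may be left as None, in which case the more general
    parent rule applies. Rates are copied from Rules 2-13 of Fig. 2.
    """
    if math_pct > 85:
        if branch == "CSE":
            if medium == "E":
                return 63.889                    # Rule 5
            if medium == "T":
                return 40.000                    # Rule 6
            return 60.976                        # Rule 4
        if branch in {"Civil", "ECE", "EEE", "IT", "MCA"}:
            return 77.523                        # Rule 3
        return 74.903                            # Rule 2

    # Assumption: exactly 85% falls into the "less than 85%" branch.
    if branch == "Civil":
        return 72.500                            # Rule 8
    if branch in {"CSE", "ECE", "EEE", "IT"}:
        if inter_pct is None:
            return 24.515                        # Rule 9
        if inter_pct > 80:
            if style == "Verbal":
                return 66.667                    # Rule 12
            if style == "Sensitive":
                return 33.333                    # Rule 13
            return 40.000                        # Rule 10
        return 14.737                            # Rule 11 (80% or less, assumed)
    return 34.359                                # Rule 7
```

For example, `success_rate(90, "CSE", medium="T")` follows Rule 6, while `success_rate(70, "ECE", inter_pct=85, style="Verbal")` follows Rule 12.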

... disturbance (bad behavior) of each record in the data set. These outliers disturb the entire data set when the rule set is modeled. The algorithm finds the k outliers as the k records with the highest BAD scores. It has been applied to the collected learning-style data. In this model, the data set is defined as D = {A1, A2, ..., Am}, the domain of an attribute j is D(Aj), and the set of all distinct values in D is V = D(A1) ∪ D(A2) ∪ D(A3) ∪ ... ∪ D(Am) = {V1j, V2j, V3j, V4j, ..., Vkj}, where 1 ≤ k ≤ n and 1 ≤ j ≤ m for each object.

TABLE 3. Comparison of Classifier Accuracies

Classifier      Number of outliers k=20   Number of outliers k=30   Number of outliers k=40
                AVF      BAD              AVF      BAD              AVF      BAD
C&R Tree        75.73    75.68            75.93    76.27            75.43    75.80
QUEST           74.68    74.41            74.83    75.16            74.35    74.51

Remaining entries, in printed order:
76.940   74.78   76.24   76.05   74.78   76.24
73.614   73.27   72.35   73.39   73.70   73.43
CHAID: 75.52   75.26   76.60
C5: 76.16   75.89   75.71
Logistic regression: 73.20   73.15
Neural N/W: 72.36   71.88   72.40   73.95

Then the proposed method to find BAD score for each record is given below.


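\[
\mathrm{BADscore}(X_i) = \frac{1}{\mathrm{Score1}(X_i) \;\square\; \mathrm{Score2}(X_i)} \qquad (2)
\]

Here

\[
\mathrm{Score1}(X_i) = \sum_{j=1}^{m} \left[ \sum_{V_{kj} \in D(A_j),\; X_{ij} = V_{kj}} \frac{f(V_{kj}) - 1}{n - 1} \, \log_{10}\!\left( \frac{f(V_{kj}) - 1}{n - 1} \right) \right] \qquad (3)
\]

\[
\mathrm{Score2}(X_i) = \sum_{j=1}^{m} \left[ \sum_{V_{kj} \in D(A_j),\; X_{ij} \neq V_{kj}} \frac{f(V_{kj})}{n - 1} \, \log_{10}\!\left( \frac{f(V_{kj})}{n - 1} \right) \right] \qquad (4)
\]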
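The two sums in Eqs. (3) and (4) can be illustrated with a short sketch. It rests on one reading of the equations, which is an assumption: for record X_i, every attribute value contributes a term p log10 p, where p is the value's frequency after X_i has been removed from the data set, that is (f − 1)/(n − 1) for the values X_i takes and f/(n − 1) for all other values. The function name `score_terms`, the helper `_plogp`, and the convention that a zero count contributes 0 are hypothetical. The sketch stops at the two terms and does not combine them into the BAD score of Eq. (2).

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def _plogp(count: int, total: int) -> float:
    """Return p * log10(p) for p = count / total.

    Assumption: a zero count contributes 0, the usual entropy convention.
    """
    if count <= 0:
        return 0.0
    p = count / total
    return p * math.log10(p)


def score_terms(data: Sequence[Sequence[str]], i: int) -> Tuple[float, float]:
    """Compute (Score1, Score2) for record i under the leave-one-out reading."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two records")
    m = len(data[0])
    score1 = 0.0
    score2 = 0.0
    for j in range(m):
        freq = Counter(row[j] for row in data)    # f(V_kj) for attribute j
        for value, f in freq.items():
            if value == data[i][j]:
                score1 += _plogp(f - 1, n - 1)    # value taken by X_i
            else:
                score2 += _plogp(f, n - 1)        # every other value
    return score1, score2
```

Under this reading, the negated sum of the two terms is the base-10 entropy, summed over attributes, of the data set without X_i. Removing a record whose values are rare therefore leaves a more uniform data set and moves the sum toward zero.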

Outliers are first deleted using the above methods, and then different classifiers are applied to frame a set of rules on how to select a course given different learning styles and learning environments. After the rule sets are framed by the classifiers, they are tested for accuracy. The accuracy of the rule sets improves after the outliers are deleted by the AVF and BAD score algorithms. Of the two algorithms, the BAD score approach gave the better set of rules for selecting courses.

V. EXPERIMENTAL RESULTS

A. Comparison of classifier (set of rules) accuracies

The collected data comprise 493 records with nine attributes, all categorical. There is one class label attribute, "B.Tech Result", with two values, "Success" and "Not success". A student who obtains more than 65% marks up to the first semester of the fourth year is treated as "Success"; otherwise the result is "Not success". The algorithms were used to delete 20, 30, and 40 outliers from the 493 records. After the outliers are deleted, different classifiers are applied to frame sets of rules. The accuracies of the rule sets framed by the different classifiers are compared in TABLE 3.
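The deletion step can be sketched for the AVF case. The sketch assumes the commonly used definition of the Attribute Value Frequency score, which is not restated in this section: a record's score is the mean frequency of its attribute values in the data set, and the k records with the lowest scores are treated as outliers. The function names `avf_scores` and `drop_k_outliers` are hypothetical, and ties are broken by record order, which is also an assumption.

```python
from collections import Counter
from typing import List, Sequence


def avf_scores(data: Sequence[Sequence[str]]) -> List[float]:
    """Mean frequency of each record's attribute values (assumed AVF definition)."""
    m = len(data[0])
    freqs = [Counter(row[j] for row in data) for j in range(m)]
    return [sum(freqs[j][row[j]] for j in range(m)) / m for row in data]


def drop_k_outliers(data: Sequence[Sequence[str]], k: int) -> List[Sequence[str]]:
    """Remove the k records with the lowest AVF scores."""
    scores = avf_scores(data)
    # sorted() is stable, so records with equal scores keep their input order.
    lowest = sorted(range(len(data)), key=lambda i: scores[i])[:k]
    outliers = set(lowest)
    return [row for i, row in enumerate(data) if i not in outliers]
```

With 493 records, calling `drop_k_outliers(records, k)` for k = 20, 30, and 40 leaves 473, 463, and 453 records for the classifiers to be trained on.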

Fig. 3. Comparison of Classifiers Accuracies

From the above figures, rule sets are framed for selecting suitable courses in which students can succeed based on their learning styles. Of all the decision trees compared, the one formed with the BAD score algorithm was the most accurate when tested with the original data. For the decision tree generated with the BAD score algorithm, support and confidence also increased compared with the other algorithm and with the original data without outlier deletion. These rule sets are generated by the CRT classification model. Among all the classifiers generated with the various outlier techniques, the classifier generated with the BAD score algorithm gave good accuracy. Rule sets for the other classifiers can be framed in the same way as the rule set generated by the CRT classifier with AVF in Fig. 2.

VI. CONCLUSION & FUTURE WORK

The effect of outlier deletion can be seen distinctly in the results, since the remaining data are classified more correctly than the original data. The experimental results indicate that outlier elimination improves the reliability of the decision tree. With this lead, a system can be developed that guides students in choosing the most suitable engineering branch, based on their learning style, to succeed in their field of choice. In future work, further analysis can be carried out to see how changes to the proposed model (considering more constraints) affect the improvement in the data.

REFERENCES

[1] Index of Learning Styles, designed by Felder & Silverman, NCSU, USA.
[2] yongsekim, "IEEE conference on outlier analysis of learners data based on user interface behaviors", 2007.
[3] Tal Bok yorn, "IEEE work shop improvement of learning styles diagnosis based on outliers reduction of user interface behaviours", 2007.
[4] Zengyou He, "An optimization model for outlier detection in categorical data".
[5] C. E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, 1948, pp. 379-423.
[6] J. Han and M. Kamber, "Data Mining: Concepts and Techniques", Elsevier, 2001.
[7] "Analysis of Felder-Silverman Index of Learning Styles by Data Driven Statistical Approach", 0-7695-2746-9/06, 2006.
[8] M. E. Otey, A. Ghoting, and A. Parthasarathy, "Fast Distributed Outlier Detection in Mixed-Attribute Data Sets", Data Mining and Knowledge Discovery.
[9] Z. He, S. Deng, and X. Xu, "A Fast Greedy Algorithm for Outlier Mining", Proc. of PAKDD, 2006.
[10] E. Knorr, R. Ng, and V. Tucakov, "Distance-based outliers: Algorithms and applications", VLDB Journal, 2000.
[11] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, "LOF: Identifying density based local outliers", ACM SIGMOD International Conference on Management of Data, 2000.
[12] Z. He, X. Xu, J. Huang, and S. Deng, "FP-Outlier: Frequent Pattern Based Outlier Detection", Computer Science and Information System (ComSIS'05), 2005.
[13] Shu Wu and Shengrui Wang, "Information-Theoretic Outlier Detection for Large-Scale Categorical Data", IEEE Transactions on Knowledge and Data Engineering, 2011.
[14] Lakshmi Sreenivasa Reddy. D, B. Raveendrababu, and A. Govardhan, "Outlier Analysis of Categorical Data using NAVF", Informatica Economică, vol. 17, no. 1, 2013.
[15] Lakshmi Sreenivasa Reddy. D and B. Raveendrababu, "Outlier Analysis of Categorical Data using FuzzyAVF", IEEE ICCPCT-2013, International conference, 2013.
[16] Lakshmi Sreenivasa Reddy. D, B. Raveendrababu, and A. Govardhan, "A Model for Improving Classifier Accuracy for Categorical Data Using Outlier Analysis", International Journal of Computers & Technology, vol. 7, no. 1, 2013.
[17] Lakshmi Sreenivasa Reddy. D, B. Raveendrababu, and A. Govardhan, "A Novel Approach to Find Outliers in Categorical Dataset", Elsevier AEMDS2013.
[18] M. Krishna Murthy, A. Govardhan, and Lakshmi Sreenivasa Reddy. D, "Outlier Analysis of Categorical Data Using Infrequency", International Journal of Computers and Technology (IJCT), vol. 8, no. 3, 2013.
[19] M. Krishna Murthy, A. Govardhan, and Lakshmi Sreenivasa Reddy. D, "A model to find outliers in mixed-attribute datasets using mixed attribute outlier factor", International Journal of Computers Science Issues (IJCSI), vol. 10, issue 5, no. 2, 2013.

2013 IEEE International Conference in MOOC, Innovation and Technology in Education (MITE)
