Keylogger Keystroke Biometric System

Brian Tschinkel, Bernard Esantsi, Dominick Iacovelli, Padma Nagesar, Richard Walz, Vinnie Monaco, and Ned Bakelman
Pace University Seidenberg School of CSIS, White Plains, NY 10606, USA
{bt66343n, be55006w, di80499p, pn53434n, rw195606p}@pace.edu, {vinmonaco, nbakelman}@gmail.com

Abstract
The system developed uses an open-source keylogger to capture data samples of all keystroke input. The keylogger output is converted to a data file format appropriate for processing by the Pace Keystroke Biometric System (PKBS). This study evaluates the overall system to determine how accurately it authenticates users based on their recorded keystroke patterns.

1. Introduction
Biometrics can be defined as the study of human traits to identify and verify a person based on their physiological and behavioral characteristics. Physiological characteristics include fingerprint, DNA, iris recognition, facial recognition, palm print, and hand geometry, while behavioral characteristics include typing rhythm, voice, and gait [1].

Since the beginning of time, facial recognition has been used to identify individuals. In ancient civilizations, palm prints and fingerprints were used to differentiate one person from another. In the 1880s, Francis Galton discovered that fingerprints do not change over time; he calculated the odds of two people having the exact same fingerprint to be 1 in 64 billion [7]. These methods worked well and were sufficient in small communities. However, as society progressed and people began to migrate, changes were necessary to prevent theft, fraud, and other criminal activities. Computer technology has significantly improved the collection and measurement of raw data, enhancing the accuracy and uniqueness of the biometric system.

A biometric system is essentially a pattern recognition system that consists of three main components: data collection, feature extraction, and classification. These components are used to collect data samples and compare them to verify a person’s identity. Figure 1 shows the process of matching samples with the data stored in a biometric database system [1].

Technology has paved the way for private industries and government agencies to incorporate biometric technologies that provide solutions for crime prevention, positive identification, and various other methods that increase security. The government utilizes these technologies to issue passports, driver’s licenses, visas, and other identification cards. For example, the government issues driver’s licenses which include eye color, hair color, height, etc. These physiological characteristics are used to identify and authenticate a person as the person they claim to be. Businesses employ similar methods for validating users’ information online, especially in the electronic banking industry. Banks often ask users to provide a user ID, password, answers to personal security questions, date of birth, social security number, etc. This information is validated against the stored data in the bank’s database system to verify that the right person is accessing the right account.

Figure 1. Basic block diagram of a biometric system [1]

Biometric technologies have increased and enhanced the study of behavioral characteristics, such as typing rhythm, voice, and walking gait. For instance, biometric computing can be applied to voice recognition to identify the person speaking (known as speaker recognition) and what the person is saying (known as speech recognition). Law enforcement officials use this technology to measure voice pitch, speaking style, vocal cord vibration, and formant frequencies to prove or disprove the identity and authenticity of people involved in crimes. A biometric voice print can be just as unique as a fingerprint.

Gait biometrics is the study of human bodily movement. Gait can be quite distinctive, particularly in people who have walking disabilities. It is used to identify and treat individuals with injuries and to help professional athletes with their performance [1].

Keystroke dynamics is the process of capturing typing rhythms, typically through the use of timing measurements such as key press (key down) and release (key up) times [1]. Features can then be determined from these timings and used to discriminate one individual’s typing pattern from another with a fair amount of accuracy.

Our study focuses on keystroke dynamics, with particular interest in keystrokes generated from spreadsheet and Web browsing input. We are mainly interested in authentication using both text and numeric keypad entries. We generate spreadsheet data samples using a specific template, and Web browsing data samples, which are then transmitted to a centralized server. These samples are collected and converted to run through the Pace Keystroke Biometric System (PKBS) for analysis and performance evaluation.
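To make the timing measurements concrete, the following minimal sketch derives two commonly used timing features from key press and release times: the duration of each keystroke (release minus press) and the transition between consecutive keystrokes (next press minus previous release). The `Keystroke` class, the function name, and the sample times are illustrative assumptions; the text does not list the exact features computed by the PKBS.

```python
from dataclasses import dataclass

@dataclass
class Keystroke:
    key: str
    press: int    # key-down time in milliseconds (hypothetical units)
    release: int  # key-up time in milliseconds

def timing_features(strokes):
    """Return per-key durations and key-to-key transition times."""
    durations = [s.release - s.press for s in strokes]
    # A negative transition means the next key went down before the
    # previous key came up (overlapping keystrokes).
    transitions = [b.press - a.release for a, b in zip(strokes, strokes[1:])]
    return durations, transitions

# Made-up timings for typing "the".
strokes = [Keystroke("t", 0, 95), Keystroke("h", 140, 230), Keystroke("e", 210, 300)]
durations, transitions = timing_features(strokes)
print(durations)    # [95, 90, 90]
print(transitions)  # [45, -20]
```

Averages and standard deviations of such values over a sample are one plausible way to build the feature vector that a classifier compares between users.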

2. Fimbel’s Basic Keylogger
Keyloggers have been used in many different environments to record data from a user at a terminal or workstation. A keylogger is a type of surveillance software or hardware that can record every keystroke a user makes with a keyboard to a log file [6]. The log file can then be sent to a security analyst for inspection, or, when the keylogger is used as spyware, the data can be sent to a hacker. In malicious applications, keyloggers are embedded into spyware that may compromise a user’s sensitive information and identity by recording account passwords and transaction data. While the keylogger used in this study is not designed for spyware purposes, it does record every keystroke while the data session is active.

Eric J. Fimbel, a native of Venezuela, created the Basic Keylogger, which records mouse and keyboard events regardless of any application that might be running in parallel [5]. Developed in Python, this keylogger is an evaluation tool for the study of human-computer interaction and was not created with malicious intent [5]. In Fimbel’s keylogger, events are stored in memory during the recording and written to a file at the end of the session. The keylogger produces two data files: a KEY log and a KPC log. The former records input events, such as pressing keys, releasing keys, and mouse movements. The latter records operations, which are more “concise than input events and show what a user is doing” [5]. Such operations include typing keys, pointing movements, and mouse clicks. Log files are stored in tab-separated values (TSV) format, which is easy to view in any spreadsheet application.

Fimbel’s Basic Keylogger is important to the PKBS, mostly because of its ability to record events regardless of what application(s) the user may be using. The KPC log can be used to track the number of operations per task, the execution time of an operation, the length of a mouse pointer movement, typing rates, and other common mouse gestures. The KEY log aids in analyzing mouse trajectories and kinematics as well as mouse click densities, all of which can be used to analyze the physical motion of a user’s hand or finger.

3. New Pace Keystroke Biometric System
The Pace Keystroke Biometric System (PKBS) has undergone many revisions in its seven years of existence. Using the Fimbel Keylogger, the system has now been adapted to incorporate keystroke and mouse input completely independent of any application(s) running on an individual’s computer (e.g., open Web browsing, spreadsheets, and instant messaging). While the old system used a Java-based application to capture data input, the new system relies on Fimbel’s keylogger to do so. The Java input system had several limitations that restricted the kinds of data that could be collected. Specifically, the Java application only captured data for text-based input within a confined text box [2] and therefore could not capture data from a spreadsheet application or a Web browser.

In order to provide a more thorough analysis of different kinds of input, the current system was revised to incorporate data from any source. The new front end of the PKBS relies on Fimbel’s keylogger to collect input from any application, while removing limitations on the amount of data that can be collected. As Fimbel’s keylogger produces keystroke data files in TSV format, a converter was necessary to ready these files for the existing back-end PKBS tools (feature extraction and classification). The new system therefore relies on a Java-based converter application to parse the keylogger recordings and produce files in Extensible Markup Language (XML) format. This format fits the input requirements of the Feature Extractor tool (Figure 2).
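The conversion from the keylogger’s tab-separated logs to XML can be illustrated with a short sketch. The column names (`time_ms`, `event`, `key`), the XML element and attribute names, and the press/release pairing rule below are assumptions made for illustration only: the text specifies neither the actual KEY/KPC log layout nor the PKBS XML schema, and the study’s converter is written in Java rather than Python.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical log: one event per row, tab-separated, with a header row.
SAMPLE_TSV = (
    "time_ms\tevent\tkey\n"
    "1000\tpress\tH\n"
    "1080\trelease\tH\n"
    "1150\tpress\tI\n"
    "1210\trelease\tI\n"
    "1300\trelease\tX\n"  # release with no matching press
)

def tsv_to_xml(tsv_text, user):
    """Pair press/release rows per key; emit one <keystroke> per pair."""
    root = ET.Element("sample", {"user": user})
    pending = {}    # key -> press time still waiting for its release
    unpaired = []   # rows that could not be converted
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        key, t = row["key"], int(row["time_ms"])
        if row["event"] == "press":
            pending[key] = t
        elif row["event"] == "release" and key in pending:
            ET.SubElement(root, "keystroke", {
                "key": key,
                "press": str(pending.pop(key)),
                "release": str(t),
            })
        else:
            unpaired.append(row)
    return root, unpaired

root, unpaired = tsv_to_xml(SAMPLE_TSV, "Nagesar_Padma")
print(ET.tostring(root, encoding="unicode"))
print(len(unpaired), "row(s) could not be converted")
```

Keeping the unconvertible rows in a separate list rather than discarding them makes it easy to inspect them later when debugging a recording.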

4. Methodology
The overall system consists of the following components:
• Keylogger data collection process
• Data converter
• PKBS backend processor
• ROC curve generator

4.1. Keylogger Data Collection Process
Our experiment for the Keylogger Keystroke Biometric System uses Fimbel’s Basic Keylogger to collect samples. A utility program was developed to start and stop the keylogger and to transmit samples to a centralized server. This utility program also allows tag information to be specified, such as user name and sample type (e.g., Word-processing, Spreadsheet, Browser, or Open). This tag information, along with a time stamp, is used to uniquely describe and identify samples. For example, if a user is generating a sample from a multiple-application environment (such as surfing the Internet, working on a spreadsheet, checking e-mail, and creating a PowerPoint presentation), the user would choose Open as the sample type.


Figure 2. Pace Keystroke Biometric System (New Additions) [2]

The experiment focused on collecting Microsoft Excel and Web browsing data samples. A standard Excel template was used for entering numeric data, and the keystrokes and mouse movements were captured using Fimbel’s Basic Keylogger.

The recording session begins by launching the utility program, which requires first and last name and application type. Since our experiment focused on capturing Excel and Web data samples, Spreadsheet and Browser were selected for the application types. Clicking the Start button launches the keylogger, as indicated by the appearance of a blue icon in the taskbar. The user proceeds by entering the required data in the Excel template (spreadsheet experiment) or searching for directions or recipes in a Web browser (Web data experiment). Once finished, the user clicks the Stop button, which closes the keylogger and prevents any more data from being captured.

The last step in the process is the transmission of the keylogger files. When the user clicks the Transmit button, two actions occur. First, the key_log.tsv and kpc_log.tsv files are renamed using the user’s name, the application type (Spreadsheet or Browser), a time stamp, and the file name (KEY or KPC) to indicate the proper log. Examples of the sample output files are Nagesar_Padma_KEY_Spreadsheet_2011-11-05-12-39-23.txt and Nagesar_Padma_KEY_Browser_2011-11-28-20-24-25.txt. The file name and extension were changed in order to easily identify each user file (Nagesar_Padma_KEY and Nagesar_Padma_KPC) and to make the file easy to open in Notepad, hence the .txt extension. Figure 3 shows the first step for collecting the data samples via the Fimbel Keylogger.
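The naming scheme can be sketched in a few lines. This is an illustration, not the utility program’s actual code: the function names are hypothetical, and the time stamp layout (year-month-day-hour-minute-second, separated by hyphens) is an assumption inferred from the example file names.

```python
from datetime import datetime

def make_log_name(last, first, log, app, when):
    """Build a name such as Nagesar_Padma_KEY_Spreadsheet_<time stamp>.txt."""
    return f"{last}_{first}_{log}_{app}_{when:%Y-%m-%d-%H-%M-%S}.txt"

def parse_log_name(name):
    """Recover the tag information; assumes no underscores inside any field."""
    stem = name[:-len(".txt")]
    last, first, log, app, stamp = stem.split("_")
    return {
        "user": f"{first} {last}",
        "log": log,
        "app": app,
        "when": datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S"),
    }

name = make_log_name("Nagesar", "Padma", "KEY", "Spreadsheet",
                     datetime(2011, 11, 5, 12, 39, 23))
print(name)  # Nagesar_Padma_KEY_Spreadsheet_2011-11-05-12-39-23.txt
info = parse_log_name(name)
print(info["user"], info["app"], info["when"])
```

Because the fields are separated by underscores, a name or application type that itself contains an underscore would need a different separator or an escaping rule.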

Figure 3. Fimbel Keylogger


4.2. Data Converter
All the sample files are converted via a new, improved Java-based converter to produce XML files that are compatible with the Pace Keystroke Biometric System. The new converter eliminates the login prompt that was part of the old converter. Instead, it parses the user’s first and last name from the KPC file name generated by the Fimbel Keylogger, as shown in Figure 4. The user name is important because it distinguishes the samples for each person in the feature vector file, which is generated during feature extraction.

[Figure 4 annotation: First & Last Name parsed from KPC file]

Figure 4. New Converter (Pre-Processing)

The new converter displays the number of keystrokes and anomalies, as shown in Figure 5. The anomalies are the keystrokes the converter could not successfully convert. An anomaly file containing any “bad” keystrokes is generated automatically and is used for debugging purposes.

The original converter was refactored to address some conversion problems and to improve maintainability. These modifications include the use of the KPC log file as the main driver for the output. The KPC file is easier to parse; it was originally discarded because it was thought not to include keystroke release times. The KEY log file is still used, but mainly as a lookup to obtain data elements not found in the KPC file, such as scan and key codes. Using the KPC file eliminated the “Out of Bounds” error that would occasionally occur during processing.

The converter program is made up of a series of method calls that display a user dialog to capture the name and location of the input and output files, obtain the user name (if any) from the input file name, parse the keystroke data, and generate the converted output in the appropriate XML format. Once the input and output paths are specified, clicking the Convert to XML button fires off the process to convert and produce the output file. Figures 4 and 5 show the converter at the pre- and post-processing stages.

Figure 5. New Converter (Post-Processing)

4.3. PKBS Backend Processor
PKBS backend processing consists of feature extraction and user authentication. The XML files are processed by the feature extractor to generate a single feature vector file. The feature vector file is then split into two files, one for testing samples and the other for training samples. The testing and training samples are then passed to the Biometric Authentication System (BAS) to classify the samples and determine the performance results. The dichotomy model is applied during this process, which generates metadata files consisting of intra-class and inter-class sizes from the test and train samples. The results obtained from this process are in the form of False Acceptance and False Rejection Rates (FAR and FRR).

For authentication (verification), a vector-difference model transforms a multi-class problem into a two-class problem. The resulting two classes are “within-class (intra-person), you are authenticated” and “between-class (inter-person), you are not authenticated.” This is a strong inferential statistics method found to be particularly effective for multidimensional feature-space problems [9].

[Figure 6 panels: (a) Feature space, with axes f1 and f2 and sample points d_i,j; (b) Feature-difference space, with axes δf1 and δf2]

Figure 6. Transformation from feature space (a) to feature distance space (b), adapted from [9]

To explain the dichotomy transformation process, take an example of three people {P1, P2, P3} where each person supplies three biometric samples. Figure 6(a) plots the biometric sample data for these three people in two-dimensional feature space. This feature space is transformed into a feature-difference space by calculating vector distances between pairs of samples of the same person (intra-person distances, denoted by x⊕) and distances between pairs of samples of different people (inter-person distances, denoted by x⊘). Let d_ij represent the feature vector of the ith person’s jth biometric sample; then x⊕ and x⊘ are calculated as follows:

$$x_{\oplus} = |d_{ij} - d_{ik}|, \quad i = 1 \text{ to } n, \quad j, k = 1 \text{ to } m, \quad j \neq k$$

$$x_{\oslash} = |d_{ij} - d_{kl}|, \quad i, k = 1 \text{ to } n, \quad i \neq k, \quad j, l = 1 \text{ to } m \qquad (1)$$

where n is the number of people, m is the number of samples per person, and the absolute value is taken element by element over these vectors. Figure 6(b) shows the transformed feature distance space for the example problem. If n people provide m biometric samples each, the numbers of intra-person and inter-person distance samples, respectively, are [9]:

$$n_{\oplus} = \frac{m \times (m-1) \times n}{2}, \qquad n_{\oslash} = m \times m \times \frac{n \times (n-1)}{2} \qquad (2)$$
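The transformation and the two counts can be checked with a small sketch. The function names and the toy feature vectors are illustrative assumptions; the code counts each unordered pair of samples once, which is the convention the count formulas imply.

```python
from itertools import combinations, product

def absdiff(a, b):
    """Element-wise absolute difference of two feature vectors."""
    return tuple(abs(x - y) for x, y in zip(a, b))

def dichotomy_transform(samples):
    """samples maps person -> list of feature vectors.

    Returns (intra-person differences, inter-person differences).
    """
    intra = [absdiff(a, b)
             for vecs in samples.values()
             for a, b in combinations(vecs, 2)]
    inter = [absdiff(a, b)
             for p, q in combinations(samples, 2)
             for a, b in product(samples[p], samples[q])]
    return intra, inter

def expected_counts(n, m):
    """Numbers of intra- and inter-person distances for n people, m samples each."""
    return m * (m - 1) * n // 2, m * m * n * (n - 1) // 2

# Three people with three 2-D samples each (made-up values).
toy = {
    "P1": [(1.0, 2.0), (1.2, 2.1), (0.9, 1.8)],
    "P2": [(3.0, 0.5), (3.1, 0.7), (2.8, 0.4)],
    "P3": [(2.0, 3.0), (2.2, 3.1), (1.9, 2.7)],
}
intra, inter = dichotomy_transform(toy)
print(len(intra), len(inter))   # 9 27
print(expected_counts(5, 10))   # (225, 1000)
print(expected_counts(10, 10))  # (450, 4500)
```

With five users supplying ten samples each the formulas give 225 intra-person and 1,000 inter-person distances, and with ten users they give 450 and 4,500, which agrees with the 225-1000 and 450-4500 sizes listed in the result tables.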

In the authentication process, a user’s keystroke sample requiring authentication is first converted into a feature vector. The difference between this feature vector and an earlier-obtained enrollment feature vector from this user is computed, and the resulting difference vector is classified as within-class (intra-person) for authentication or between-class (inter-person) for non-authentication. The k-nearest-neighbor method performs this classification by comparing this feature-difference vector against those in the training set. To obtain system performance, we simulate the authentication process of many true users trying to get authenticated and of many imposters trying to get authenticated as other users. This is done by using the numbers of the inter- and intra-person distances explained above.

4.4. ROC Curve Generator
The authentication results are used by the Receiver Operating Characteristic (ROC) curve generation process to provide further analysis. This process generates a curve that shows the trade-off between the FAR and FRR at different threshold operating points. Typically, the FAR and FRR have an inverse relationship, meaning one increases while the other decreases across the operating points. The data for all three experiments includes performance values and average rates for false acceptance and false rejection.

The false rejection rate (FRR), also known as a Type I error, is the rate at which a system fails to verify or identify an authorized person [4]. It measures the likelihood that a system will incorrectly reject an authorized user. Conversely, the false acceptance rate (FAR), known as a Type II error, is the rate at which a system incorrectly verifies an unauthorized user [3]. This error is an extremely serious security breach, as unauthorized users are inadvertently allowed access to a system.

5. Experimental Design
Three weak training experiments were conducted with the following data:
1. 100 spreadsheet data-entry samples, 10 samples from 10 users
2. 100 text-input samples from previous research [8] and the above 100 spreadsheet data-entry samples
3. 100 Web browsing data samples, 10 samples from 10 users

Weak training means the system was not trained on the users supplying the test samples, while strong training means the system was trained on the users supplying the test samples.

For the first experiment, training and testing were performed on spreadsheet data. The samples were taken using an Excel template developed for this experiment, shown in Figure 7. Whole numerical values in the thousands were entered in the green cells, while values with two decimal places were entered in the pink cells. Additionally, three journal entries were specified, each of which included a brief text description. The samples were converted into Extensible Markup Language (XML) files using the new converter described in Section 4.2 and shown in Figures 4 and 5. The XML files were processed through the Feature Extractor to produce a single feature vector file. The feature vector file was split into two files (test and train). The test file contained 50 records for the team members and the train file contained 50 records for the non-team members. The test and train files were passed to the Biometric Authentication System (BAS) to obtain the performance results. In order to validate the accuracy of the results, the records for the test and train files were reversed (the train file consisted of the fifty team member records and the test file consisted of the fifty non-team member records) and rerun through the BAS.

In the second experiment, the system was trained on text data and tested on spreadsheet data, and then trained on spreadsheet data and tested on text data. The text data consisted of 100 text-input samples from previous research [8]. The spreadsheet data consisted of the 100 spreadsheet samples obtained from experiment 1. The same conversion process was applied to produce XML files, which were then passed to the Pace Keystroke Biometric System backend for feature extraction and classification. The feature vector file produced by the feature extraction was divided into 100 spreadsheet records for testing and 100 text-input records for training, and results were obtained. The files were then reversed (the test file contained 100 text-input records and the train file contained 100 spreadsheet records) and rerun through the BAS to validate the performance.

For the third experiment, training and testing were performed on Web browsing data. The samples were taken while the users browsed the Internet for ten to fifteen minutes. The same conversion process was applied to produce XML files, which were then passed to the Pace Keystroke Biometric System backend for feature extraction and classification. The feature vector file produced by the feature extraction was divided into 50 team member records for testing and 50 non-team member records for training. The files were then reversed (the test file contained 50 non-team member records and the train file contained 50 team member records) and rerun through the BAS to validate the performance.

The results from all three experiments were compared to ascertain the False Acceptance Rate (FAR) and False Rejection Rate (FRR). The next section provides more detailed information and explanation about these results.
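A weak-training run of the kind described above, including the k-nearest-neighbor classification of feature-difference vectors from Section 4.3, can be sketched as follows. This is a simplified illustration and not the BAS implementation: the function names, the Euclidean distance, the majority vote, and the toy 2-D feature vectors are all assumptions.

```python
import math
from collections import Counter
from itertools import combinations, product

def absdiff(a, b):
    return tuple(abs(x - y) for x, y in zip(a, b))

def labelled_differences(samples):
    """Return (difference vector, label) pairs for one group of users."""
    out = [(absdiff(a, b), "within")
           for vecs in samples.values() for a, b in combinations(vecs, 2)]
    out += [(absdiff(a, b), "between")
            for p, q in combinations(samples, 2)
            for a, b in product(samples[p], samples[q])]
    return out

def knn_label(query, training, k):
    """Majority label among the k training vectors closest to query."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def evaluate(train_users, test_users, k):
    """Train on one group of users, test on a disjoint group (weak training)."""
    training = labelled_differences(train_users)
    testing = labelled_differences(test_users)
    within = [v for v, lab in testing if lab == "within"]
    between = [v for v, lab in testing if lab == "between"]
    false_rejects = sum(knn_label(v, training, k) == "between" for v in within)
    false_accepts = sum(knn_label(v, training, k) == "within" for v in between)
    return false_rejects / len(within), false_accepts / len(between)

# Hypothetical feature vectors, three samples per user.
group_a = {"A": [(100, 50), (102, 51), (101, 49)],
           "B": [(150, 80), (151, 82), (149, 79)]}
group_b = {"C": [(120, 60), (121, 61), (119, 59)],
           "D": [(170, 95), (171, 94), (169, 96)]}

frr, far = evaluate(group_a, group_b, k=3)
frr_rev, far_rev = evaluate(group_b, group_a, k=3)  # reversed roles
print(frr, far, frr_rev, far_rev)
```

The toy users are deliberately well separated, so both runs classify every difference vector correctly; real keystroke data overlaps far more. Using an odd k, as in the experiments, avoids tied votes between the two classes.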

Figure 7. Excel Template Developed for Experiment

6. Results and Discussions

The first experiment, on the Excel data sample files, produced results similar to those of a previous study [2]. This experiment included 50 samples from five non-team members for training and 50 samples from five team members for testing (10 users, each producing ten samples). Table 1 shows the average false rejection rate as 15.64%, the average false acceptance rate as 21.86%, and the average performance as 79.28%.

Table 1. Summarized Results for Experiment 1

kNN        Train      Test       FRR      FAR      Perf.
1          225-1000   225-1000   13.33%   25.50%   76.73%
3          225-1000   225-1000   16.44%   19.10%   81.39%
5          225-1000   225-1000   16.00%   21.00%   79.92%
7          225-1000   225-1000   16.44%   21.70%   79.27%
9          225-1000   225-1000   16.00%   22.00%   79.10%
Average:                         15.64%   21.86%   79.28%

Reversing the experiment (50 samples from team members for training and 50 samples from non-team members for testing) yielded improved results (Table 2). For this experiment, the average FRR, FAR, and performance rates were 15.65%, 9.69%, and 89.26%, respectively.

Table 2. Summarized Results for Reversed Experiment 1

kNN        Train      Test       FRR      FAR      Perf.
1          225-1000   225-1000   13.43%   12.29%   87.51%
3          225-1000   225-1000   17.59%    9.02%   89.47%
5          225-1000   225-1000   16.20%    9.02%   89.71%
7          225-1000   225-1000   15.74%    9.22%   89.63%
9          225-1000   225-1000   15.28%    8.92%   89.96%
Average:                         15.65%    9.69%   89.26%

The second experiment used the 100 samples from the first experiment and 100 text-input keystroke samples from earlier research [8]. Training on text data and testing on spreadsheet data yielded averages of 26.49%, 11.25%, and 87.39% for the FRR, FAR, and performance rates, respectively (Table 3).

Table 3. Summarized Results for Experiment 2

kNN        Train      Test       FRR      FAR      Perf.
1          450-4500   450-4500   24.72%   14.15%   84.91%
3          450-4500   450-4500   30.84%    9.07%   88.99%
5          450-4500   450-4500   25.85%   10.58%   88.06%
7          450-4500   450-4500   26.08%   11.16%   87.52%
9          450-4500   450-4500   24.94%   11.29%   87.49%
Average:                         26.49%   11.25%   87.39%

Reversing the experiment (training on spreadsheet data and testing on text data) yielded rather low performance rates and, as expected, a high average false acceptance rate (Table 4). This is not unusual, as the current system was designed to operate on text data. Since the spreadsheet samples consist mostly of numeric input, they are not a good candidate for testing against alphabetic text-input keystrokes. However, with the addition of feature measurements specific to numeric entry, we would expect to see improvement in this area.

Table 4. Summarized Results for Reversed Experiment 2

kNN        Train      Test       FRR      FAR      Perf.
1          450-4500   450-4500    0.22%   83.62%   23.96%
3          450-4500   450-4500    0.89%   78.44%   28.61%
5          450-4500   450-4500    0.44%   81.00%   26.32%
7          450-4500   450-4500    0.00%   81.29%   26.10%
9          450-4500   450-4500    0.22%   81.18%   26.18%
Average:                          0.35%   81.11%   26.23%

The third experiment included 50 samples from five team members for testing and 50 samples from five non-team members for training. Table 5 shows the average FRR, FAR, and performance rates as 29.80%, 16.83%, and 81.22%, respectively. As the system is not yet designed for mouse movement and click features (typical of Web browsing activities), these results are better than expected.

Table 5. Summarized Results for Experiment 3

kNN        Train      Test       FRR      FAR      Perf.
1          225-1000   225-1000   21.29%   25.90%   74.86%
3          225-1000   225-1000   33.17%   13.20%   83.51%
5          225-1000   225-1000   33.66%   14.66%   82.20%
7          225-1000   225-1000   30.20%   15.35%   82.20%
9          225-1000   225-1000   30.69%   15.05%   83.37%
Average:                         29.80%   16.83%   81.22%

Reversing the experiment (50 samples from team members for training and 50 samples from non-team members for testing) yielded a similar overall performance rate (Table 6), although the FRR was considerably higher. The average FRR, FAR, and performance rates were 48.62%, 11.72%, and 81.50%, respectively.

Table 6. Summarized Results for Reversed Experiment 3

kNN        Train      Test       FRR      FAR      Perf.
1          225-1000   225-1000   38.67%   21.80%   75.10%
3          225-1000   225-1000   54.22%    9.00%   82.69%
5          225-1000   225-1000   48.89%   10.00%   82.86%
7          225-1000   225-1000   49.78%    9.30%   83.27%
9          225-1000   225-1000   51.56%    8.50%   83.59%
Average:                         48.62%   11.72%   81.50%

A receiver operating characteristic (ROC) curve can be plotted to analyze the performance of the Pace Keystroke Biometric System (PKBS). The curve for the first experiment is shown in Figure 8. With 1,276 plotted data points, the approximate equal error rate (EER), where the FAR and FRR are equal, is 17.8%.

[Plot: ROC Curve for Experiment 1; x-axis False Rejection Rate, y-axis False Acceptance Rate, both from 0.00 to 0.50]

Figure 8. ROC Curve for Experiment 1, EER ≅ 17.8%

The 45-degree line on the ROC curve graphs helps identify the equal error rate. A lower equal error rate indicates better accuracy in the system. The equal error rate occurs at the intersection of the 45-degree line and the ROC curve, where the false acceptance rate is equal to the false rejection rate.

The curve for the second experiment is shown in Figure 9. The EER for this ROC curve, which consists of 5,051 data points, is approximately 18.6%. The first experiment yielded a lower EER with favorable values for the false rejection and false acceptance rates.

[Plot: ROC Curve for Experiment 2; x-axis False Rejection Rate, y-axis False Acceptance Rate, both from 0.00 to 0.50]

Figure 9. ROC Curve for Experiment 2, EER ≅ 18.6%

The ROC curve for the third experiment is shown in Figure 10. The EER for this ROC curve, which consists of 1,276 data points, is approximately 19.9%. The first and second experiments yielded lower EERs with favorable values for the FRR and FAR.

[Plot: ROC Curve for Experiment 3; x-axis False Rejection Rate, y-axis False Acceptance Rate, both from 0.00 to 0.50]

Figure 10. ROC Curve for Experiment 3, EER ≅ 19.9%
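The relationship between the reported rates, and the way an EER can be read off a set of ROC points, can be illustrated with a short sketch. Several things here are assumptions rather than statements from the text: the error counts (30 and 255) are inferred from the k = 1 row of Table 1 by treating 225-1000 as the numbers of intra- and inter-person test distances; performance is taken to be the overall fraction of correct decisions, a definition the text does not give but which reproduces the reported 76.73%; and the ROC points are invented purely to exercise the function.

```python
def error_rates(false_rejects, n_within, false_accepts, n_between):
    """Return (FRR, FAR, overall fraction of correct decisions)."""
    frr = false_rejects / n_within
    far = false_accepts / n_between
    performance = 1 - (false_rejects + false_accepts) / (n_within + n_between)
    return frr, far, performance

def approximate_eer(roc_points):
    """roc_points holds (frr, far) pairs, one per threshold.

    Picks the point closest to the 45-degree line and averages its two rates.
    """
    frr, far = min(roc_points, key=lambda p: abs(p[0] - p[1]))
    return (frr + far) / 2

# Counts assumed from Table 1, k = 1 (see the note above).
frr, far, perf = error_rates(30, 225, 255, 1000)
print(f"{frr:.2%} {far:.2%} {perf:.2%}")  # 13.33% 25.50% 76.73%

# Invented ROC points, not measured data.
toy_roc = [(0.05, 0.40), (0.10, 0.28), (0.17, 0.19), (0.25, 0.12), (0.40, 0.06)]
eer = approximate_eer(toy_roc)
print(f"{eer:.2f}")  # 0.18
```

With many plotted points the closest point to the 45-degree line is usually a good approximation; with sparse points, interpolating between the two points that straddle the line would be more accurate.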

The ROC curves and performance results may reflect some discrepancies related to how the input samples were generated. In particular, the second experiment used archived text samples that were generated with the old input system, while the spreadsheet data used in the same experiment were generated with the new keylogger-based system. Some discrepancies were also introduced with the Fimbel keylogger, in the form of scan codes showing up in the KPC file where they previously existed only in the KEY file. This was the result of a recent update from Eric Fimbel. While this update was ultimately not used in these experiments, any future update that changes the keylogger output formats will need to be addressed in future trials. Some Web samples, mainly the longer ones (generated over a 15-minute time frame), did not transmit to the Pace Vulcan server; this will also need to be explored in future trials.

7. Conclusions
The Pace Keystroke Biometric System (PKBS) is a useful application for studies on keystroke biometrics. The new system takes full advantage of an open-source keylogger, as data samples can now be recorded from any application on any operating system. Although the introduction of the open-source keylogger requires extra preparation before the feature extractor and analysis tools can be used, the performance (FAR, FRR, EER, etc.) in the experiments conducted here is nonetheless favorable and suggests that the keylogger integrates well into the Keystroke Biometric System.

A concern from the experiments conducted in this study is the impact non-uniform backgrounds (i.e., different computing environments) can have on the generation of samples and subsequent results. The creation of a virtual machine could help minimize these differences and possibly reduce the number of steps required to perform experiments. In this way, the Pace Keystroke Biometric System can continue to grow and be a powerful biometric analysis tool in a multi-platform environment.

8. References
[1] Wikipedia (2011, October 17). Biometrics. [Online]. Available: http://en.wikipedia.org/wiki/Biometrics.
[2] J. Deluca et al., “A System-Wide Keystroke Biometric System,” Pace University Seidenberg School of CSIS, White Plains, NY, May 2011.
[3] False Acceptance (Type II Error). [Online]. Available: http://searchsecurity.techtarget.com/definition/falseacceptance.
[4] False Rejection (Type I Error). [Online]. Available: http://searchsecurity.techtarget.com/definition/falserejection.
[5] E. J. Fimbel (2011, March 01). Basic Keylogger. [Online]. Available: https://sites.google.com/site/basiclabbook/keyloggerbasiclabbook.
[6] D. Kim and M. Solomon, “Malicious Attacks, Threats, and Vulnerabilities,” in Fundamentals of Information Systems Security. Sudbury, MA: Jones & Bartlett Learning, 2012, ch. 3, p. 90.
[7] Online Digital Education Connection (2011, October 17). History of Fingerprinting. [Online]. Available: http://www.odec.ca/projects/2004/fren4j0/public_html/history_of_fingerprinting.htm.
[8] H. Poorshatery et al., “Keystroke Biometric System Test Taker Setup and Data Collection,” Pace University Seidenberg School of CSIS, White Plains, NY, December 2010.
[9] S. Yoon, S.-S. Choi, S.-H. Cha, Y. Lee, and C. C. Tappert (2005). On the individuality of the iris biometric. Proc. Int. J. Graphics, Vision & Image Processing, 5(5), 63-70.