
Using Simulation to Explore the Reliability, Fairness, and Validity of Statistical Models
Daniel B. Wright
Alder Graduate School of Education

Abstract
**

Keywords: simulation, statistical models, graphical models, Standards

Different statistical models are appropriate for different situations. Therefore it is necessary to assess the accuracy of a statistical model under different data models. According to the American Educational Research Association et al. (2014), there are three central pillars for a test to be accurate: reliability, fairness, and validity. This tutorial describes how a statistical model can be assessed against these pillars using simulation and provides examples for each of them. The critical and most difficult aspect of statistical simulations is constructing a range of plausible data models that are realistic enough to test whether the statistical model is reliable, fair, and valid enough to be used.

For current purposes, statistical models are algorithms that transform data into a few summary statistics. The data models are descriptions of how the data may arise. These need to be precise enough that simulated data can be created. They can be complex and realistic, or simple, to demonstrate how some aspect of the data affects the statistical models’ accuracy. Do not over-kill with precision, and be pragmatic: what is practical depends on whether your estimation is least squares or a complex Bayesian method.

Speeding up simulations is somewhat platform and package dependent, but options include compiled code, running in parallel, and, if the code is slow, trimming steps such as checking for missing values.

Example: Estimating How Much Somebody Has Learned

An important question for psychologists, cognitive scientists, education researchers and policy makers, and others is how much somebody has learned. Reliability concerns the dispersion of estimates within replications. Often the extremes are of interest, so look at the distributions.
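The distinction between a data model and a statistical model, and the idea of looking at dispersion across replications, can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the function names `data_model` and `stat_model`, the sample size of 50, the standard normal data model, and the 1000 replications are all hypothetical choices and do not come from the paper.

```r
# Illustrative sketch only: all names and settings here are assumptions.
set.seed(1)

# Data model: a description of how the data may arise
# (here, n independent draws from a normal distribution).
data_model <- function(n, mean = 0, sd = 1) rnorm(n, mean = mean, sd = sd)

# Statistical model: an algorithm that turns data into a few summary statistics.
stat_model <- function(x) c(mean = mean(x), sd = sd(x))

# Replicate: simulate data, apply the statistical model, keep the estimates.
n_replications <- 1000
estimates <- t(replicate(n_replications, stat_model(data_model(n = 50))))

# Dispersion of each estimate across replications.
dispersion <- apply(estimates, 2, sd)

# Extremes are often of interest, so look at the tails as well as the middle.
extremes <- apply(estimates, 2, quantile, probs = c(0.01, 0.5, 0.99))

print(dispersion)
print(extremes)
```

Swapping in a more realistic `data_model` or a slower `stat_model` leaves the replication loop unchanged, which is one reason to keep the two pieces as separate functions.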

This research is supported by the Chan Zuckerberg Initiative. email: [email protected] or [email protected].


Suppose that you have student scores from three annual tests and, for illustration, that the data are drawn from normal distributions with the same mean and standard deviation, and that there are no missing values. Call these $t_{1j}$, $t_{2j}$, and $t_{3j}$ for student $j$, or more generally $t_{ij}$, where $i$ indexes the test number. There are different methods that could be used to estimate growth for each student. The simplest mathematically is the least-squares slope for the student, which with three equally spaced points is $\mathrm{slope}_j = (t_{3j} - t_{1j})/2$. The student can calculate this, so if some fancy statistical algorithm produces a different estimate it will be suspect. One more complex procedure that is sometimes suggested for measuring growth is estimating a multilevel model with a random slope *cites*.
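As a rough check on the simple slope, the sketch below simulates three test scores per student from the data model just described and confirms that $(t_{3j} - t_{1j})/2$ matches a per-student least-squares fit. The number of students (200), the mean of 0, the standard deviation of 1, and all object names are assumptions made for illustration; this is not the paper's own simulation code.

```r
# Illustrative sketch only: settings and names are assumptions, not the paper's.
set.seed(2)
n_students <- 200

# Data model from the text: three annual tests, same mean and SD, no missing values.
scores <- matrix(rnorm(n_students * 3, mean = 0, sd = 1), ncol = 3,
                 dimnames = list(NULL, c("t1", "t2", "t3")))

# Closed-form slope for three equally spaced time points.
slope_formula <- (scores[, "t3"] - scores[, "t1"]) / 2

# The same quantity from a least-squares fit for each student on time = 1, 2, 3.
time <- 1:3
slope_lm <- apply(scores, 1, function(y) coef(lm(y ~ time))[["time"]])

# The two should agree up to floating-point error.
max_abs_diff <- max(abs(slope_formula - slope_lm))
print(max_abs_diff)

# Distribution of the estimated slopes across students.
print(summary(slope_formula))
print(sd(slope_formula))
```

Under this data model there is no true growth, yet the estimated slopes still vary: with independent scores of standard deviation $\sigma$, the slope has variance $2\sigma^2/4 = \sigma^2/2$, so its standard deviation is about $0.71\sigma$.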

suppressPackageStartupMessages(require(lme4))
options(warn=-1)
replics