StatisticalInferenceBiblio.pdf

© 2013, Timothy G. Gregoire, Yale University

http://environment.yale.edu/profile/gregoire/bibliographies Last revised: December 2014

Statistical Inference Bibliography 1920-Present

1. Misc. 2 On Bias and Randomness.
2. Misc. 3 Lopsided reasoning.
3. Pearson, K. (1920) “The Fundamental Problem in Practical Statistics.” Biometrika, 13(1): 1-16.
4. Edgeworth, F.Y. (1921) “Molecular Statistics.” Journal of the Royal Statistical Society, 84(1): 71-89.
5. Fisher, R. A. (1922) “On the Mathematical Foundations of Theoretical Statistics.” Philosophical Transactions of the Royal Society of London, Series A, Containing Papers of a Mathematical or Physical Character, 222: 309-368.
6. Neyman, J. and E. S. Pearson. (1928) “On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference: Part I.” Biometrika, 20A(1/2): 175-240.
7. Fisher, R. A. (1933) “The Concepts of Inverse Probability and Fiducial Probability Referring to Unknown Parameters.” Proceedings of the Royal Society of London, Series A, Containing Papers of Mathematical and Physical Character, 139(838): 343-348.
8. Buchanan-Wollaston, H. J. (1935) “Statistical Tests”, Nature v136: 182-183.
9. Fisher, R. A. (1935) “The Logic of Inductive Inference.” Journal of the Royal Statistical Society, 98(1): 39-82.
10. Fisher, R. A. (1936) “Uncertain inference.” Proceedings of the American Academy of Arts and Sciences, 71: 245-258.
11. Neyman, J. (1937). Outline of a theory of statistical estimation based on the classical theory of probability. Philosophical Transactions of the Royal Society of London, Series A. 236: 333-380.
12. Berkson, J. (1942) “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association, 37(219): 325-335.
13. Berkson, J. (1942) “Tests of Significance Considered as Evidence.” Reprinted in International Journal of Epidemiology (from 1942 JASA article) 32:687-691.

14. Barnard, G. A. (1949) “Statistical Inference.” Journal of the Royal Statistical Society, Series B (Methodological), 11(2): 115-149.
15. Fisher, R. (1955) “Statistical Methods and Scientific Induction.” Journal of the Royal Statistical Society, Series B (Methodological), 17(1): 69-78.
16. Pearson, E. S. (1955) “Statistical Concepts in their Relation to Reality.” Journal of the Royal Statistical Society, Series B (Methodological), 17(2): 204-207.
17. Yates, F. (1955) “Discussion on the Paper by Dr. Box and Dr. Anderson.” Statistical Inference, Robustness, and Modeling Strategy, JRSS-B, 17(1): 31.
18. Bartlett, M.S. (1956) Comment on Sir Ronald Fisher’s Paper: “On a Test of significance in Pearson’s Biometrika Tables (No. 11)”. Journal of the Royal Statistical Society Series B 18(2): 295 – 296.
19. Fisher, R. (1956) On a Test of significance in Pearson’s Biometrika Tables (No. 11). Journal of the Royal Statistical Society Series B 18(1): 56 – 60.
20. Neyman, J. (1956) Note on an Article by Sir Ronald Fisher. Journal of the Royal Statistical Society Series B 18(2): 288 – 294.
21. Welch, B.L. (1956) Note on some criticisms made by Sir Ronald Fisher. Journal of the Royal Statistical Society Series B 18(2): 297 – 302.
22. Lindley, D. V. (1957). A statistical paradox. Biometrika 44(1/2) 187-192.
23. Cox, D. R. (1958) “Some Problems Connected with Statistical Inference.” Annals of Mathematical Statistics, 29(2): 357-372.
24. Good, I. J. (1958) “Significance Tests in Parallel and In Series.” Journal of the American Statistical Association, 53: 799-813.
25. Eysenck, H. J. (1960) “The Concept of Statistical Significance and the Controversy about One-tailed Tests”, Psychological Review 67(4) 269-271.
26. Natrella, M. G. (1960) “The Relation Between Confidence Intervals and Tests of Significance.” The American Statistician, 14: 20-22 & back cover.
27. Rozeboom, W. W. (1960) “The Fallacy of the Null-Hypothesis Significance Test.” Psychological Bulletin, 57(5): 416-428.
28. Neyman, J. (1961) “Silver Jubilee of My Dispute with Fisher.” Journal of the Operations Research Society of Japan, 3(4): 145-154.

29. Pratt, J. W. (1961) “Testing Statistical Hypotheses.” Journal of the American Statistical Association, 56(293): 163-167.
30. Barnard, G. A., G. M. Jenkins, & C. B. Winsten. (1962) “Likelihood Inference and Time Series.” Journal of the Royal Statistical Society, Series A (General), 125(3): 321-372.
31. Birnbaum, A. (1962) “On the Foundations of Statistical Inference.” Journal of the American Statistical Association, 57(298): 269-306.
32. Pearson, E. S. (1962) “Some Thoughts on Statistical Inference.” Annals of Mathematical Statistics, 33(2): 394-403.
33. Fraser, D. A. S. (1963) “On the Sufficiency and Likelihood Principles.” Journal of the American Statistical Association, 58(303): 641-647.
34. Kendall, M. G. (1963) “Ronald Aylmer Fisher, 1890-1962.” Biometrika, 50(1/2): 1-15.
35. Platt, J. R. (1964) “Strong Inference.” Science, 146(3642): 347-353.
36. Dempster, A. P. and M. Schatzoff. (1965) “Expected Significance Level as a Sensitivity Index for Test Statistics.” Journal of the American Statistical Association, 60(310): 420-436.
37. Pratt, J. W. (1965) “Bayesian interpretation of standard inference statements.” Journal of the Royal Statistical Society 27(2) 169-203.
38. Cornfield, J. (1966) “Sequential Trials, Sequential Analysis and the Likelihood Principle.” The American Statistician, 20: 18-23.
39. Cutler, S. J., et al. (1966) “The Role of Hypothesis Testing in Clinical Trials.” Journal of Chronic Disease, 19: 857-882.
40. Selvin, H. C. and Stuart, A. (1966) “Data-dredging Procedures in Survey Analysis.” The American Statistician, 20: 20-23.
41. Royall, R. (1968). “An old approach to finite population sampling theory.” Journal of the American Statistical Association 63: 1269-1279.
42. Seeger, P. (1968) “A Note on a Method for the Analysis of Significances en masse” Technometrics 10(3): 586-593.
43. Edwards, A. W. F. (1969) “Statistical Methods in Scientific Inference.” Nature, 222(June): 1233-1237.
44. Tukey, J. W. (1969) Analyzing Data: Sanctification or Detective Work? American Psychologist 83-91.

45. Edwards, A. W. F. (1970) “Likelihood.” Nature, 227(July): 92.
46. Durbin, J. (1970) “On Birnbaum’s Theorem on the Relation Between Sufficiency, Conditionality and Likelihood.” Journal of the American Statistical Association, 65(329): 395-398.
47. Kyburg, Jr. H.E. (1971) “Probability and informative inference”. Proceedings of the symposium on the Foundations of Statistical Inference prepared under the auspices of the Rene Descartes Foundation and held at the Department of Statistics, University of Waterloo, Ontario, Canada, from March 31 to April 9, 1970: 82 – 107.
48. Neyman, J. (1971) “Foundations of Behavioristic statistics”. Proceedings of the symposium on the Foundations of Statistical Inference prepared under the auspices of the Rene Descartes Foundation and held at the Department of Statistics, University of Waterloo, Ontario, Canada, from March 31 to April 9, 1970: 1-19.
49. Rao, C.R. (1971) “Some aspects of statistical inference in problems of sampling from finite populations”. Proceedings of the symposium on the Foundations of Statistical Inference prepared under the auspices of the Rene Descartes Foundation and held at the Department of Statistics, University of Waterloo, Ontario, Canada, from March 31 to April 9, 1970: 117 – 202.
50. Leamer, E. E. (1974) “False Models and Post-Data Model Construction.” Journal of the American Statistical Association, 69(345): 122-131.
51. Spielman, S. 1974. Philosophy of Science: The Logic of Tests of Significance.
52. Kempthorne, O. (1975) “Inference from Experiments and Randomization.” In A Survey of Statistical Design and Linear Models, J. N. Srivastava, ed., North-Holland Publishing Company. Pages 303-331.
53. Robinson, G. K. (1975). Some counterexamples to the theory of confidence intervals. Biometrika 62(1) 155-161.
54. Joshi, V. M. (1976). A note on Birnbaum’s theory of the likelihood principle. Journal of the American Statistical Association. 71: 345-346.
55. Cox, D. R. (1977) “The Role of Significance Tests.” Scandinavian Journal of Statistics, 4: 49-70.
56. Guttman, L. (1977) “What is Not What in Statistics.” The Statistician, 26(2): 81-107.
57. Kiefer, J. (1977) “The Foundations of Statistics – Are there any?” Synthese 36: 161 – 176.
58. Giere, R.N. (1977) “Allan Birnbaum’s conception of statistical evidence”. Synthese 36: 5-13.

59. Neyman, J. (1977) “Frequentist probability and frequentist statistics”. Synthese 36: 97 – 131.
60. Robinson, G. K. (1977). Conservative statistical inference. Journal of the Royal Statistical Society, Series B. 39: 381-386.
61. Smith, C. A. B. (1977) “The analogy between decision and inference”. Synthese 36: 71 – 85.
62. Carver, R. P. (1978) “The Case Against Statistical Significance Testing.” Harvard Educational Review, 48(3): 378-398.
63. Eberhardt L.L. (1978) “Appraising Variability in Population Studies”. Journal of Wildlife Management, 42(2): 207-238.
64. Good, I. J. (1980) “The diminishing significance of a p-value as the sample size increases.” Journal of Statistical Computation & Simulation, 11: 307-313.
65. Dolby, G. R. (1982) “The Role of Statistics in the Methodology of the Life Sciences.” Biometrics, 38: 1069-1083.
66. Good, I. J. (1982) “Standardized tail-area probabilities.” Journal of Statistical Computation and Simulation, 16: 65-75.
67. Schweder, T. and E. Spjøtvoll. (1982). “Plots of P-values to Evaluate Many Tests Simultaneously.” Biometrika, 69(3): 493-502.
68. Leamer, E. E. (1983) “Let’s Take the Con out of Econometrics.” The American Economic Review, 73(1): 31-43.
69. Leamer, E. and H. Leonard. (1983) “Reporting the Fragility of Regression Estimates.” The Review of Economics and Statistics, 65(2): 306-317.
70. Good, I. J. (1984) “How Should Tail-Area Probabilities be Standardized for Sample Size in Unpaired Comparisons?” C191 in Journal of Statistical Computation and Simulation, 19: 174.
71. Mayo, D.G. (1985) “Behavioristic, Evidentialist and learning models of statistical testing”. Philosophy of Science 52: 493 – 516.
72. Salsburg, D.S. (1985) “The religion of statistics as practiced in Medical journals”. The American Statistician 39(3): 220 – 223.
73. Thompson, W. A. Jr. (1985). Optimal significance procedures for simple hypotheses. Biometrika 72(1) 230-232.

74. Berger, J. O. (1986) “Are P-Values Reasonable Measures of Accuracy?” In Pacific Statistical Congress, I. S. Francis et al., eds., Elsevier Science Publishers, the Netherlands. Pages 21-27.
75. Cox, D. R. (1986) “Some General Aspects of the Theory of Statistics.” International Statistical Review, 54(2): 117-126.
76. Fleiss, J. L. (1986) “Significance Tests Have a Role in Epidemiologic Research: Reactions to A. M. Walker.” American Journal of Public Health, 76(5): 559-560.
77. Fleiss, J. L. (1986) “Letters to the Editor: Confidence Intervals vs Significance Tests: Quantitative Interpretation.” American Journal of Public Health, 76(5): 587.
78. Hall, P. and B. Selinger. (1986) “Statistical Significance: Balancing Evidence Against Doubt.” Australian Journal of Statistics, 28(3): 354-370.
79. Johnstone, D. J. (1986). Tests of significance in theory and practice. The Statistician 35, 491-504.
80. Royall, R. M. (1986) “The Effect of Sample Size on the Meaning of Significance Tests.” The American Statistician, 40(4): 313-315. Also: Bailey, K. R. (1987) “Comment on Royall (1986).” The American Statistician, 41(3): 245-246.
81. Walker, A. M. (1986) “Reporting the Results of Epidemiologic Studies.” American Journal of Public Health, 76(5): 556-558.
82. Warren, W.G. (1986) “On the presentation of statistical analysis: reason or ritual”. Canadian Journal of Research 16: 1185 – 1191.
83. Berger, J. O. and M. Delampady. (1987) “Testing Precise Hypotheses.” Statistical Science, 2(3): 317-352.
84. Berger, J. O. and T. Sellke. (1987) “Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence.” Journal of the American Statistical Association, 82(397): 112-139.
85. Casella, G. and R. L. Berger. (1987) “Reconciling Bayesian and Frequentist Evidence in the One-Sided Testing Problem.” Journal of the American Statistical Association, 82(397): 106-135.
86. Chow, S.L. (1987) “Science, Ecological validity and experimentation”. Journal for the Theory of Social Behaviour 17: 181 – 194.
87. Hill, B. M. (1987) “The validity of the likelihood principle.” The American Statistician 41(2) 95-100.

88. Poole, C. (1987) “Beyond the Confidence Interval.” American Journal of Public Health, 77(2): 195-199.
89. Thompson, W. D. (1987) “Statistical Criteria in the Interpretation of Epidemiologic Data.” American Journal of Public Health, 77(2): 191-194.
90. Berger, J. O. and D. A. Berry. (1988) “Statistical Analysis and the Illusion of Objectivity.” American Scientist, 76(2): 159-165.
91. Goodman, S. N. and R. Royall. (1988) “Evidence and Scientific Research.” American Journal of Public Health, 78(12): 1568-1574.
92. Schweder, T. (1988) “A Significance Version of the Basic Neyman-Pearson Theory for Scientific Hypothesis Testing.” Scandinavian Journal of Statistics, 15: 225-242.
93. Sorić, B. (1989) “Statistical “Discoveries” and Effect-Size Estimation.” Journal of the American Statistical Association, 84(406): 608-610.
94. Vuong, Q. H. (1989) “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.” Econometrica, 57(2): 307-333.
95. Anscombe, F. J. (1990) “The Summarizing of Clinical Experiments by Significance Levels.” Statistics in Medicine, 9: 703-708.
96. Barnard, G. A. (1990) “Must Clinical Trials Be Large? The Interpretation of P-Values and the Combination of Test Results.” Statistics in Medicine, 9: 601-614.
97. Begg, C. B. (1990) “On Inferences from Wei’s Biased Coin Design for Clinical Trials.” Biometrika, 77(3): 467-84.
98. Cohen, J. (1990) “Things I Have Learned (So Far).” American Psychologist, 45(12): 1304-1312.
99. Peterman, R. M. (1990) “The Importance of Reporting Statistical Power: The Forest Decline and Acidic Deposition Example.” Ecology, 71(5): 2024-2027.
100. Rice, W. R. (1990) “A Consensus Combined P-Value Test and the Family-Wide Significance of Component Tests.” Biometrics, 46: 303-308.
101. Salsburg, D. (1990) “Hypothesis Versus Significance Testing for Controlled Clinical Trials: A Dialogue.” Statistics in Medicine, 9: 201-211.
102. Besag, J. and P. Clifford. (1991) “Sequential Monte Carlo p-values.” Biometrika, 78(2): 301-304.

103. Yoccoz, N. G. (1991) “Commentary: Use, Overuse, and Misuse of Significance Tests in Evolutionary Biology and Ecology.” Bulletin of the Ecological Society of America, 72(2): 106-111.
104. Faraway, J. J. (1992) “On the Cost of Data Analysis” ???? 1(3) 213-229.
105. Goodman, S. N. (1992) “A Comment on Replication, P-Values and Evidence.” Statistics in Medicine, 11: 875-879.
106. Wright, S. P. (1992) “Adjusted P-Values for Simultaneous Inference.” Biometrics, 48: 1005-1013.
107. Freeman, P. R. (1993) “The Role of P-Values in Analysing Trial Results.” Statistics in Medicine, 12: 1443-1452.
108. Hurlbert, H. and White, M. D. (1993) “Experiments with freshwater invertebrate zooplanktivores: Quality of statistical analysis” Bulletin of Marine Science, 53(1) 128-153.
109. Lee, Y. J. and H. Quan. (1993) “P-Values After Repeated Significance Testing: A Simple Approximation Method.” Statistics in Medicine, 12: 675-684.
110. Lehmann, E. L. (1993) “The Fisher, Neyman-Pearson Theories of Testing Hypotheses: One Theory or Two?” Journal of the American Statistical Association, 88(424): 1242-1249.
111. McBride, G., J. C. Loftis, and N. C. Adkins. (1993) “What Do Significance Tests Really Tell Us About the Environment?” Environmental Management, 17(4): 423-432.
112. Wang, C. (1993) Sense and Nonsense of Statistical Inference: Controversy, Misuse, and Subtlety. Marcel Dekker, New York.
113. Cohen, J. (1994) “The Earth is round (p < .05)”. American Psychologist 49: 997 – 1003.
114. Goodman, S. N. and J. A. Berlin. (1994) “The Use of Predicted Confidence Intervals When Planning Experiments and the Misuse of Power When Interpreting Results.” Annals of Internal Medicine, 121(3): 200-206.
115. Inman, H. F. (1994). Karl Pearson and R. A. Fisher on Statistical Tests: A 1935 Exchange from Nature. The American Statistician 48(1) 2-11.
116. Chatfield, C. (1995) Model Uncertainty, Data Mining, and Statistical Inference. Journal of the Royal Statistical Society, Series A. 158: 419-466.
117. Edwards, A. W. F. (1995) “XVIIIth Fisher Memorial Lecture Delivered at the Natural History Museum, London, on Thursday, 20th October, 1994, Fiducial Inference and the Fundamental Theorem of Natural Selection.” Biometrics, 51(3): 799-809.

118. Goutis, C. and G. Casella. (1995) “Frequentist Post-Data Inference.” International Statistical Review, 63(3): 325-344.
119. Johnson, D. H. (1995) “Statistical sirens: the allure of nonparametrics”. Ecology 76(6) 1998 – 2000.
120. Keuzenkamp, H. A. and J. R. Magnus. (1995) “On Tests and Significance in Econometrics.” Journal of Econometrics, 67: 5-24.
121. Keuzenkamp, H. A. and M. McAleer. (1995) “Simplicity, Scientific Inference, and Econometric Modelling.” The Economic Journal, 105: 1-21.
122. Sagan, C. (1995) The Demon-Haunted World: Science as a Candle in the Dark. (See page 113 for the quote “absence of evidence is not evidence of absence.”)
123. Smith, S. M. (1995) “Distribution-free and robust statistical methods: viable alternatives to parametric statistics”. Ecology 76(6) 1997 – 1998.
124. Tsou, T. and R. M. Royall. (1995) “Robust Likelihoods.” Journal of the American Statistical Association, 90(429): 316-320.
125. Dollinger, M., E. Kulinskaya & R.G. Staudte et al. (1996) “When is a p-Value a Good Measure of Evidence?” In Robust Statistics, Data Analysis, and Computer Intensive Methods, (H. Rieder, ed.), No. 109 in Lecture Notes in Statistics pp. 119-134.
126. Mislevy, R. J. (1996) “Evidence and Inference in Educational Assessment.” CSE Technical Report 414, National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Graduate School of Education and Information Studies, The Regents of the University of California, Los Angeles.
127. Nester, M. R. (1996) “An Applied Statistician’s Creed.” Applied Statistics, 45(4): 401-410.
128. Schervish, M. J. (1996) “P Values: What They Are and What They Are Not.” The American Statistician, 50(3): 203-206.
129. Bower, B. (1997) “Psychology’s Statistical Status Quo Draws Fire.” Science News, 151: 356-357.
130. Hayes, J. P. and R. J. Steidl. (1997) “Statistical Power Analysis and Amphibian Population Trends.” Conservation Biology, 11(1): 273-275.
131. Hung, H. M. J., et al. (1997) “The Behavior of the P-Value When the Alternative Hypothesis is True.” Biometrics, 53: 11-22.

132. Pruzek, R. M. (1997) “An Introduction to Bayesian Inference and its Applications.” In What if There Were No Significance Tests?, L. L. Harlow, et al., eds. Lawrence Erlbaum Associates, Publishers: Mahwah, New Jersey, & London, pages 287-318.
133. Rindskopf, D. M. (1997) “Testing “Small,” Not Null, Hypotheses: Classical and Bayesian Approaches.” In What if There Were No Significance Tests?, L. L. Harlow, et al., eds. Lawrence Erlbaum Associates, Publishers: Mahwah, New Jersey, & London, pages 319-332.
134. Royall, R. (1997) Statistical Evidence: A Likelihood Paradigm. Chapman & Hall/CRC.
135. Thomas, L. (1997) “Retrospective Power Analysis.” Conservation Biology, 11(1): 276-280.
136. Tukey, J. (1997) More Honest Foundations for Data Analysis. Journal of Statistical Planning and Inference 57: 21-29.
137. Cherry, S. (1998) “Statistical Tests in Publications of The Wildlife Society.” Wildlife Society Bulletin, 26(4): 947-953.
138. Efron, B. (1998) “R. A. Fisher in the 21st Century: Invited Paper Presented at the 1996 R. A. Fisher Lecture.” Statistical Science, 13(2): 95-122.
139. Gerard, P. D., et al. (1998) “Limits of Retrospective Power Analysis.” Journal of Wildlife Management, 62(2): 801-807.
140. Shen, W. and T.A. Louis. (1998) “Triple-goal estimates in two-stage hierarchical models.” Journal of the Royal Statistical Society B, 60(2): 455-471.
141. Thompson, J. R. (1998) “Invited Commentary: Re: ‘Multiple Comparisons and Related Issues in the Interpretation of Epidemiologic Data.’” American Journal of Epidemiology, 147(9): 801-806.
142. Vieland, V. J. and S. E. Hodge. (1998) “Book Reviews: Statistical Evidence: A Likelihood Paradigm, by Richard Royall.” American Journal of Human Genetics, 63: 283-289.
143. Zumbo, B. D. and A. M. Hubley. (1998) “A note on misconceptions concerning prospective and retrospective power.” The Statistician, 47(2): 385-388.
144. Donahue, R. M. J. (1999) “A Note on Information Seldom Reported Via the P Value.” The American Statistician, 53(4): 303-306.
145. Goodman, S. N. (1999) “Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy.” Annals of Internal Medicine, 130(12): 995-1004.

146. Goodman, S. N. (1999) “Toward Evidence-Based Medical Statistics. 2: The Bayes Factor.” Annals of Internal Medicine, 130(12): 1005-1013.
147. Johnson, D. H. (1999) “The Insignificance of Statistical Significance Testing.” Journal of Wildlife Management, 63(3): 763-772.
148. Lindsey, J. K. (1999) “Some Statistical Heresies.” The Statistician, 48(1): 1-40.
149. Perlman, M. D. and L. Wu. (1999) “The Emperor’s New Tests.” Statistical Science, 14(4): 355-381.
150. Sackrowitz, H. and E. Samuel-Cahn. (1999) “P Values as Random Variables—Expected P Values.” The American Statistician, 53(4): 326-331.
151. Stockmarr, A. (1999) “Likelihood Ratios for Evaluating DNA Evidence When the Suspect is Found Through a Database Search.” Biometrics, 55: 671-677.
152. Bayarri, M. J. and J. O. Berger. (2000) “P Values for Composite Null Models.” Journal of the American Statistical Association, 95(452): 1127-1172.
153. Robinson, A. (2000) “Slides from A. Robinson’s talk on A Jaundiced View of Hypothesis and Significance Testing.” University of Idaho.
154. Royall, R. (2000) “On the Probability of Observing Misleading Statistical Evidence.” Journal of the American Statistical Association, 95(451): 760-773.
155. Barker, L., H. Rolka, D. Rolka, C. Brown. (2001) “Equivalence Testing for Binomial Random Variables: Which Test to Use?” The American Statistician, 55(4): 279.
156. Dennis, B. (2001) “Statistics and the Scientific Method in Ecology.” Draft for The Nature of Scientific Evidence, M. L. Taper and S. R. Lele, eds., The University of Chicago Press. 33 pages.
157. Gregoire, T. G. (2001) “Biometry in the 21st Century: Whither Statistical Inference?” Keynote address presented at The Conference on Forest Biometry and Information Science (http://cms1.gre.ac.uk/conferences/iufro/proceedings/), 25-29 June 2001, The University of Greenwich, London, U.K.
158. Hoenig, J. M. and D. M. Heisey. (2001) “The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis.” The American Statistician, 55(1): 1-6.
159. Lenth, R. V. (2001) “Some Practical Guidelines for Effective Sample Size Determination.” The American Statistician, 55(3): 187-193.
160. Pace, M. L. (2001) “Prediction and the Aquatic Sciences.” Canadian Journal of Fisheries and Aquatic Sciences, 58: 63-72.

161. Salsburg, D. (2001) The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. A. W. H. Freeman: New York.
162. Schenker, N. and J. F. Gentleman. (2001) “On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals.” The American Statistician, 55(3): 182-186.
163. Schnute, J. T. and L. J. Richards. (2001) “Use and Abuse of Fishery Models.” Canadian Journal of Fisheries and Aquatic Science, 58: 10-17.
164. Schweder, T. and N. L. Hjort. (2001) “Confidence and Likelihood.” Statistical Research Report, Department of Mathematics, University of Oslo. [ISBN: 82-553-1278-1]
165. Sellke, T., M. J. Bayarri, and J. O. Berger. (2001) “Calibration of p values for testing precise null hypotheses.” The American Statistician 55(1) 62-71.
166. Senn, S. (2001) “Statistical Issues in Bioequivalence.” Statistics in Medicine, 20: 2785-2799.
167. Sterne, J. A. C. and G. D. Smith. (2001) “Sifting the Evidence—What’s Wrong With Significance Tests?” British Medical Journal, 322: 226-231.
168. Tryon, W.W. (2001) “Evaluating statistical difference, equivalence and indeterminacy using inferential confidence intervals: An integrated alternative method of conducting Null Hypothesis statistical tests”. Psychological Methods 6(4): 371 – 386.
169. Berger, J. O. (2002) “Could Fisher, Jeffreys, and Neyman Have Agreed on Testing?” Paper based on the Fisher lecture, given at the 2001 Joint Statistical Meetings by the author, Duke University.
170. Bird, C.D. (2002) “Confidence intervals for effect sizes in analysis of variance”. Educational and Psychological Measurement 62: 197-226.
171. Blume, J. D. (2002) “Likelihood methods for measuring statistical evidence.” Statistics in Medicine 21: 2563-2599.
172. Farrant, T. (2002) “To p or not to p.” Royal Statistical Society News, 29(10): 21.
173. Goodman, S. N. (2002) “Author’s Reply.” Statistics in Medicine, 21: 2445-2447.
174. Johnson, D. H. (2002) The Role of Hypothesis Testing in Wildlife Science. Journal of Wildlife Management 66(2) 272-276.
175. Knapp, T. R. (2002) “Some reflections on significance testing” Journal of Modern Applied Statistical Methods 1(2) 240-242.

176. Robinson, D. H. and Wainer, H. (2002) On the Past and Future of Null Hypothesis Significance Testing. Journal of Wildlife Management 66(2) 263-271.
177. Senn, S. (2002) “Letter to the Editor: A comment on replication, p-values and evidence.” Statistics in Medicine, 21: 2437-2444.
178. Sterne, J. A. (2002) Teaching hypothesis tests – time for significant change? Statistics in Medicine 21: 985-994.
179. Berger, J. O. (2003) Could Fisher, Jeffreys, and Neyman have agreed on testing? Statistical Science 18(1) 1-12.
180. Browner, W. S. (2003) “The reliability of P values”. Science, 301, 167-168.
181. Dass, S. C. and J. O. Berger. (2003) “Unified Conditional Frequentist and Bayesian Testing of Composite Hypotheses.” Scandinavian Journal of Statistics, 30: 193-210.
182. Edwards, A.W.F. (2003) “R.A. Fisher—twice Professor of Genetics: London and Cambridge, or ‘A fairly well-known geneticist’.” The Statistician, 52(3): 311-318.
183. Fisher, R. A. (2003) Note on Dr. Berkson’s criticism of tests of significance. International Journal of Epidemiology 32: 692.
184. Goodman, S. (2003) “Commentary: the p-value, devalued” International Journal of Epidemiology 32: 699-702.
185. Green, P. J. (2003) “Notes on the life and work of R.A. Fisher.” The Statistician, 52(3): 299-301.
186. Gregoire, T. G. (2003) “Why was Fisher so mad with Neyman?” Presentation at Western Mensurationist Meeting, 14 July 2003, Victoria, BC, Canada.
187. Healy, M. J. R. (2003) “R. A. Fisher the statistician.” The Statistician, 52(3): 303-310.
188. Hubbard, R. and M. J. Bayarri. (2003) “Confusion Over Measure of Evidence (p’s) Versus Errors (α’s) in Classical Statistical Testing.” The American Statistician, 57(3): 171-178.
189. Lele, S. (2003). Various work and correspondence. Includes: Subhash, L. and A. Das. (2003) “Elicited data and incorporation of expert opinion for statistical inference in spatial studies.” Mathematical Geology, 32(4): 465-467.
190. Onwuegbuzie, A.J. and J.R. Levin. (2003) “Without supporting statistical evidence, where would reported measures of substantive importance lead? To no good effect.” Journal of Modern Applied Statistical Methods, 2(1): 133-151.

191. Royall, R. & T-S Tsou. (2003) Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. Journal of the Royal Statistical Society B, 65(2) 391-404.
192. Senn, S. (2003) “Foreword: A blue plaque for Fisher.” The Statistician, 52(3) 297-298.
193. Smith, G. D. (2003) “Uncertainty and significance” International Journal of Epidemiology, 32: 683.
194. Stone, M. (2003) “Commentary: Worthwhile Polemic or Transatlantic storm-in-a-teacup?” International Journal of Epidemiology, 32: 693-694.
195. Bickel, D.R. 2004. Degree of differential gene expression: detecting biologically significant expression differences and estimating their magnitudes. Bioinformatics 20(5): 683 – 688.
196. Garcia V.L. (2004) Escaping the Bonferroni iron claw in ecological studies. Journal of Ecology 657-663.
197. Stefano, J.D. (2004) “A confidence interval approach to data analysis”. Forest Ecology and Management 187: 173 – 183.
198. Taper, M. L. & Lele, S. R. (2004) The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations. University of Chicago Press.
199. Christensen, R. (2005) Testing Fisher, Neyman, Pearson, and Bayes. The American Statistician, 59(2) 121-126.
200. Stephens, P.A. and Buskirk, S.W. (2005) Information theory and hypothesis testing: a call for pluralism. Journal of Applied Ecology 42: 4-12.
201. Korn, E. L. & B. Freidlin. (2006) The likelihood as statistical evidence in multiple comparisons in clinical trials: no free lunch. Biometrical Journal 3: 346-355.
202. Lenhard, J. (2006) Models and Statistical Inference: The Controversy between Fisher and Neyman-Pearson. British Journal for the Philosophy of Science 57: 69-91.
203. Moerkerke, B., Goetghebeur, E., De Riek, J., and Roldan-Ruiz, I. 2006. Significance and impotence: towards a balanced view of the null and the alternative hypotheses in marker selection for plant breeding. J. R. Statist. Soc. 169: 61 - 79.
204. Thompson, B. (2006). Critique of p-values. International Statistical Review. 74(1) 1-14.
205. Yuan, Y. (2007). Bayesian meta-analysis of highly-cited controlled clinical trials based on test statistics. (IBC meeting in Montreal, 2006??).

206. Boyles, R. A. (2008). The role of likelihood in interval estimation. The American Statistician. 62(10) 22-26.
207. Spanos, A. (2008) “Review of Stephen T. Ziliak and Deirdre N. McCloskey’s The cult of statistical significance: how the standard error costs us jobs, justice, and lives. Ann Arbor (MI): The University of Michigan Press, 2008, XXII+ 322 pp”. Erasmus Journal for Philosophy and Economics 1(1): 154 – 164.
208. Cox, D. R. (2009) Randomization in the Design of Experiments. International Statistical Review 77(3) 415-429.
209. Hurlbert, S. H. (2009) The Ancient Black Art and Transdisciplinary Extent of Pseudoreplication. Journal of Comparative Psychology, 123(4) 434-443.
210. Hurlbert, S.H. and Lombard, C.M. (2009) Final collapse of the Neyman-Pearson decision theoretic framework and rise of the neoFisherian. Annales Zoologici Fennici 46: 311 – 349.
211. Reichardt, C.S. and Gollob, H.F. (2009) “Justifying the use and increasing the power of a t Test for a randomized experiment with a convenience sample”. Psychological Methods 4: 117 – 128.
212. Breadford, M., Stricland, M. & Keiser, A. (2010) “Statistics and soils” Course notes.
213. Browne, R. H. (2010) The t-test p value and its relationship to the effect size and P(X>Y). The American Statistician 64(1) 30-33. (Correction, TAS, 64(2) 195)
214. Micceri, T. (2010) The Unicorn, the normal curve, and other improbable creatures. Psychological Bulletin 105(1) 156-166.
215. Zuo, Y. (2010) “Is the t confidence interval X̄ ± tα(n - 1)s/√n optimal?”. The American Statistician 64(2): 170 – 173.
216. Boos, D.D. and Stefanski L.A. (2011) p-Value Precision and Reproducibility. The American Statistician 65(4): 213-221.
217. Hubbard, R. (2011) The widespread misinterpretation of p-values as error probabilities. Journal of Applied Statistics 38(11): 2617-2626.
218. Kass, R.E. (2011) Statistical Inference: The Big Picture. Institute of Mathematical Statistics 26(1): 1-9.
219. Lewis, F., Butler, A. and Gilbert, L. (2011) A unified approach to model selection using the likelihood ratio test. Methods in Ecology and Evolution 2: 155 – 162.

220. Picquelle, S.J. and Mier, K.L. (2011) A practical guide to statistical methods for comparing means from two-stage sampling. Fisheries Research 107: 1-13.
221. Wild, C.J., Pfannkuch, M. and Regan, M. (2011) Towards more accessible conceptions of statistical inference. Journal of Royal Statistical Society 174(2): 247 – 295.
222. Friston, K. (2012) “Ten ironic rules for non-statistical reviewers”. NeuroImage 61: 1300 – 1310.
223. Hurlbert, S.H. (2012) Pseudofactorialism, response structures and collective responsibility. Austral Ecology: 1-18.
224. Hurlbert, S.H. and Lombard, C.M. (2012) Lopsided Reasoning on Lopsided Tests and Multiple Comparisons. Australian & New Zealand Journal of Statistics 54(1): 23 – 42.
225. Bacchetti, P. (2013) “Small sample size is not the real problem”. Nature Reviews: Neuroscience 14: 585. doi: 10.1038/nrn3475-c3.
226. Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J. & Munafo. (2013) “Power failure: why small sample size undermines the reliability of neuroscience”. Nature Reviews Neuroscience. doi: 10.1038/nrn3475.
227. Cox, D. R. (2013) “A return to an old paper: ‘Tests of separate families of hypotheses’” Journal of the Royal Statistical Society, Series B. 75(2) 207 – 215.
228. Hurlbert, S.H. (2013) Affirmation of the classical terminology for experimental design via a critique of Casella’s Statistical design. Agronomy Journal 105(2): 412 – 418.
229. Johnson, V. E. (2013) “Revised standards for statistical evidence”. Proceedings of the National Academies of Science 110(48) 19313 – 19317. doi: 10.1073/pnas.1313476110.
230. Lin, W. (2013) “Agnostic notes on regression adjustments to experimental data: Reexamining Freedman’s critique”. The Annals of Applied Statistics 7(1): 295 – 318.
231. Colquhoun, D. (2014) “An investigation of the false discovery rate and the misinterpretation of p-values”. Royal Society Open Science. doi: 10.1098/rsos.140216.
232. Foster, C. (2014) “Confidence trick: the interpretation of confidence intervals”. Canadian Journal of Science, Mathematics, and Technology Education 14(1) 23 – 34. doi: 10.1080/14926156.2014.874615.
233. Low-Décarie, E., Chivers, C. & Granados, M. (2014) “Rising complexity and falling explanatory power in ecology”. Frontiers in Ecology and Environment 12(7) 412 – 418.
