MINITAB: AN OVERVIEW - IASRI

MINITAB: AN OVERVIEW
Rajender Parsad
I.A.S.R.I., Library Avenue, New Delhi – 110 012
[email protected]

The functionality of MINITAB is accessible through interactive windows and menus, or through a command language called session commands. There are three windows, viz. the Data window, the Session window and the Project Manager. The Data window is a worksheet in spreadsheet format, with rows and columns that intersect to form individual cells. A worksheet can contain up to 4000 columns, 1000 constants and up to 10,000,000 rows, depending on the memory of the computer. The text output generated by the analyses is displayed in the Session window. The Project Manager contains folders that allow one to navigate, view and manipulate various parts of the project.

Minitab has advanced Design of Experiments (DOE) capabilities. One can screen the factors to determine which are important for explaining process variation. It can generate two-level full and fractional factorial designs, Plackett-Burman designs, Box-Behnken and central composite designs, simplex centroid and simplex lattice designs, and Taguchi orthogonal array designs. It also allows one to perform one-way analysis of variance, two-way analysis of variance for balanced data, tests for equality of variances, and to generate various plots. Balanced ANOVA models with crossed or nested and fixed or random factors can also be analyzed. The option General MANOVA analyzes balanced or unbalanced MANOVA models with crossed or nested and fixed or random factors. The analysis of covariance is also possible with the option General MANOVA.

To start working with MINITAB, from the Windows Taskbar choose Start → Programs → MINITAB 14 (MINITAB SOLUTIONS) → MINITAB 14 (MINITAB 15). Minitab opens with two main windows, viz. the Session window and the Data window. These two windows make up the first screen of MINITAB.

Under the Data menu, the following options are available:

Subset Worksheet - copies specified rows from the active worksheet to a new worksheet
Split Worksheet - splits or unstacks the active worksheet into two or more new worksheets based on one or more "By" variables
Merge Worksheets - combines two worksheets into one new worksheet
Sort - sorts one or more columns of data
Rank - assigns rank scores to values in a column
Delete Rows - deletes specified rows from columns in the worksheet
Erase Variables - erases any combination of columns, stored constants and matrices
Copy - copies selections from one position in the worksheet to another; can copy entire selections or a subset
Stack - stacks columns on top of each other to make longer columns
Unstack - unstacks (or splits) columns into shorter columns
Transpose Columns - switches columns to rows
Concatenate - combines two or more text columns side by side into one new column
Code - recodes values in columns
Change Data Type - changes columns from one data type (such as numeric, text, or date/time) to another
Display Data - displays data from the current worksheet in the Session window
Extract from Date/Time to Numeric/Text - extracts one or more parts of a date/time column, such as the year, the quarter, or the hour, and saves that data in a numeric or a text column

In the worksheet, one can enter the data in columns numbered C1, C2, …. The names of the variables can be written in the row below the row containing the column numbers C1, C2, ….

The Calc menu has the following sub-options:

Calculator - does arithmetic using an algebraic expression, which may contain arithmetic operations, comparison operations, logical operations, and functions
Column Statistics - calculates various statistics based on a column you select
Row Statistics - calculates various statistics for each row of the columns you select
Standardize - centers and scales columns of data
Make Patterned Data - provides an easy way to fill a column with numbers or date/time values that follow a pattern
Make Mesh Data - creates a regular (x,y) mesh to use for drawing contour, 3D surface and wireframe plots, with the option to create the z-variable as well

Make Indicator Variables - creates indicator (dummy) variables that you can use in regression analysis
Set Base - fixes a starting point for Minitab's random number generator
Random Data - displays commands for generating a random sample of numbers, sampled either from columns of the worksheet or from a variety of distributions
Probability Distributions - displays commands that allow you to compute probabilities, probability densities, cumulative probabilities, and inverse cumulative probabilities for continuous and discrete distributions
Matrices - displays commands for doing matrix operations

The main menu for statistical data analysis is Stat. Under this menu, the following sub-options are available:

Basic Statistics
Regression
ANOVA (Analysis of Variance)
DOE (Design of Experiments)
Control Charts
Quality Tools
Reliability/Survival
Multivariate
Time Series
Tables
Nonparametrics
EDA (Exploratory Data Analysis)
Power and Sample Size

In Basic Statistics, the following commands can be reached by selecting Stat > Basic Statistics: Display Descriptive Statistics, Store Descriptive Statistics, Graphical Summary, 1-Sample Z, 1-Sample t, 2-Sample t, Paired t, 1 Proportion, 2 Proportions, 1-Sample Poisson Rate, 2-Sample Poisson Rate, 1 Variance, 2 Variances, Correlation, Covariance, Normality Test, and Goodness-of-Fit Test for Poisson. Further sub-options are then available within each command.
For performing regression analysis, from the menus choose Stat > Regression and then select one of the following commands to fit a model relating a response to one or more predictors:

Regression - does simple, multiple and polynomial regression
Stepwise - does stepwise regression, forward selection, and backward elimination
Best Subsets - does best subsets regression
Fitted Line Plot - fits a simple linear or polynomial regression model and plots the regression line through the actual data or the log10 of the data
Partial Least Squares - does partial least squares regression
Binary Logistic Regression - does logistic regression for a binary response variable
Ordinal Logistic Regression - does logistic regression for an ordinal response variable
Nominal Logistic Regression - does logistic regression for a nominal response variable

For performing analysis of variance, choose Stat > ANOVA. This option allows one to perform analysis of variance, test for equality of variances, and generate various plots. The analysis can be carried out using the suitable sub-option:

One-Way - performs a one-way analysis of variance, with the response in one column and subscripts in another, and performs multiple comparisons of means
One-Way (Unstacked) - performs a one-way analysis of variance, with each group in a separate column
Two-Way - performs a two-way analysis of variance for balanced data
Analysis of Means - displays an Analysis of Means chart for normal, binomial, or Poisson data
Balanced ANOVA - analyzes balanced ANOVA models with crossed or nested and fixed or random factors
General Linear Model - analyzes balanced or unbalanced ANOVA models with crossed or nested and fixed or random factors. You can include covariates and perform multiple comparisons of means.
Fully Nested ANOVA - analyzes fully nested ANOVA models and estimates variance components
Balanced MANOVA - analyzes balanced MANOVA models with crossed or nested and fixed or random factors
General MANOVA - analyzes balanced or unbalanced MANOVA models with crossed or nested and fixed or random factors. You can also include covariates.
Test for Equal Variances - performs Bartlett's and Levene's tests for equality of variances
Interval Plot - produces graphs that show the variation of group means by plotting standard error bars or confidence intervals
Main Effects Plot - generates a plot of response main effects
Interactions Plot - generates an interaction plot (or matrix of plots)

Minitab can also be used for generating the layout of two-level full and fractional factorial designs using Stat > DOE > Factorial. For generating Box-Behnken and central composite designs, use Stat > DOE > Response Surface. Simplex centroid and simplex lattice designs for mixture experiments can be obtained using Stat > DOE > Mixture. Taguchi orthogonal arrays can be generated using Stat > DOE > Taguchi.

Minitab can perform principal components analysis, factor analysis, cluster analysis, discriminant analysis, and correspondence analysis. For performing multivariate data analysis, choose Stat > Multivariate and then one of the following sub-options, depending upon the analysis required:

Principal Components - performs principal components analysis
Factor Analysis - performs factor analysis
Item Analysis - performs item analysis

Cluster Observations - performs agglomerative hierarchical clustering of observations
Cluster Variables - performs agglomerative hierarchical clustering of variables
Cluster K-Means - performs K-means non-hierarchical clustering of observations
Discriminant Analysis - performs linear and quadratic discriminant analysis
Simple Correspondence Analysis - performs simple correspondence analysis on a two-way contingency table
Multiple Correspondence Analysis - performs multiple correspondence analysis on three or more categorical variables

Choosing Stat > EDA gives exploratory data analysis methods, used to explore data before applying more traditional methods or to examine residuals from a model. These methods are particularly useful for identifying extraordinary observations and noting violations of traditional assumptions, such as nonlinearity or nonconstant variance. The following sub-options may be used:

Stem-and-Leaf - does a stem-and-leaf plot
Boxplot - does a box-and-whiskers plot
Letter Values - prints a letter-value display
Median Polish - uses median polish to analyze a two-way layout
Resistant Line - fits a line to data using a procedure that is resistant to outliers
Resistant Smooth - smooths data (usually a time series)
Rootogram - prints a suspended rootogram

Minitab may also be used for Control Charts, Quality Tools, Reliability/Survival, Time Series, Tables, Nonparametrics, and Power and Sample Size. The other menus in Minitab are Graph, Editor, Tools, Windows and Help. Clicking on Help opens the MINITAB help screen.

Some practical exercises using MINITAB are given in the sequel.

t-test

Example 2.1: In a certain experiment to compare two types of pig foods A and B, the following results of increase in weights were observed in the same set of 8 pigs:

Food A: 49 53 51 52 47 50 52 53
Food B: 52 55 52 53 50 54 54 53

Can we conclude that food B is better than A?

Solution: The paired t-test is to be used here. The data have to be entered in the MINITAB worksheet in two separate columns, C1 and C2, in the following manner:

C1   C2
49   52
53   55
51   52
52   53
47   50
50   54
52   54
53   53

Steps: Stat → Basic Statistics → Paired t → enter C1 in First sample and C2 in Second sample → OK
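As a cross-check on what the Paired t procedure computes, the following minimal Python sketch (standard library only; the variable names are illustrative and not part of MINITAB) works out the mean difference, its standard deviation and standard error, and the t statistic from the eight pairs of Example 2.1. The p-value is not computed here; the t statistic would be referred to a t distribution with n - 1 = 7 degrees of freedom.

```python
import math
import statistics

# Weight gains from Example 2.1 (the same 8 pigs under each food).
food_a = [49, 53, 51, 52, 47, 50, 52, 53]   # column C1
food_b = [52, 55, 52, 53, 50, 54, 54, 53]   # column C2

# Paired differences C1 - C2, in the same order as the worksheet rows.
diffs = [a - b for a, b in zip(food_a, food_b)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)        # sample SD, divisor n - 1
se_diff = sd_diff / math.sqrt(n)         # standard error of the mean difference
t_value = mean_diff / se_diff            # referred to t with n - 1 df

print(f"Mean difference       = {mean_diff:.5f}")
print(f"SD of differences     = {sd_diff:.5f}")
print(f"SE of mean difference = {se_diff:.5f}")
print(f"T-value               = {t_value:.2f}")
```

The printed values should agree with the Difference row and the T-Value of the MINITAB output for this example.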

Output:

Paired T-Test and CI: C1, C2

Paired T for C1 - C2

              N      Mean    StDev   SE Mean
C1            8   50.8750   2.1002    0.7425
C2            8   52.8750   1.5526    0.5489
Difference    8  -2.00000  1.30931   0.46291

95% CI for mean difference: (-3.09461, -0.90539)
T-Test of mean difference = 0 (vs not = 0): T-Value = -4.32  P-Value = 0.003

Correlation and Regression

Example 2.2: In diabetic rats, the blood sugar and endogenous insulin levels were estimated. Find out if there is a correlation between these two parameters.

Rat No.                  1    2    3    4    5    6    7    8
Blood Sugar (x), mg%   156  102  134  184  198  203  123  176
Insulin (y), IU         16   21   18   11   10    8   20   11

Solution: For obtaining the correlation coefficient using MINITAB, from the menus choose: Stat → Basic Statistics → Correlation → select two or more numeric variables → check the box Display p-values and click OK. The output for the above example with MINITAB is

Pearson correlation of x and y = -0.984
P-Value = 0.000
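The Pearson coefficient reported for Example 2.2 can be reproduced from its definition. The sketch below is a plain Python illustration (the function name pearson_r is an assumption made for this example, not a MINITAB command); it also forms the usual t statistic for testing zero correlation, which would be referred to a t distribution with n - 2 degrees of freedom.

```python
import math

# Data from Example 2.2.
blood_sugar = [156, 102, 134, 184, 198, 203, 123, 176]   # x, mg%
insulin = [16, 21, 18, 11, 10, 8, 20, 11]                # y, IU


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(blood_sugar, insulin)

# t statistic for H0: correlation = 0, on n - 2 degrees of freedom.
n = len(blood_sugar)
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

print(f"r = {r:.4f}")
print(f"t = {t_stat:.2f} on {n - 2} df")
```

The strongly negative r, with a large |t| on 6 degrees of freedom, is in line with the very small P-value reported by MINITAB.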

To calculate Spearman's rank correlation coefficient using MINITAB, ensure that there are no missing values in the data. If the data are not ranked, use Data → Rank and then compute Pearson's correlation on the columns of ranked data, as explained earlier. Uncheck Display p-values, because the p-value given there is not accurate for Spearman's r and should not be used to interpret it.

To obtain the partial correlation using MINITAB:
1 Regress the first variable on the other variables and store the residuals.
2 Regress the second variable on the other variables and store the residuals.
3 Calculate the correlation between the two columns of residuals.

Example 2.3: Given the following data, fit a simple linear regression equation between y and x1. Also fit a multiple linear regression equation with y as dependent and x1, x2, x3 and x4 as independent variables.

Observation No.       y   x1   x2   x3   x4
 1                 78.5    7   26    6   60
 2                 74.3    1   29   15   22
 3                104.3   11   56    8   20
 4                 87.6   11   31    8   47
 5                 95.9    7   52    6   33
 6                109.2   11   55    9   22
 7                102.7    3   71   17    6
 8                 72.5    1   31   22   44
 9                 93.1    2   54   18   22
10                115.9   21   47    4   26
11                 83.8    1   40   23   34
12                113.3   11   66    9   12
13                119.4   10   68    8   12

For fitting a regression equation using MINITAB, from the menus choose: Stat → Regression → Regression → select the response variable → select one or more independent variables.

Multiple Linear Regression

The output for the above example obtained using MINITAB is

Regression Analysis: y versus x1, x2, x3, x4

The regression equation is
y = 53.6 + 1.59 x1 + 0.661 x2 + 0.084 x3 - 0.076 x4

Predictor      Coef   SE Coef      T      P
Constant    53.6300   10.2700   5.22  0.001
x1           1.5887    0.2670   5.95  0.000
x2           0.6606    0.1140   5.79  0.000
x3           0.0845    0.2493   0.34  0.743
x4          -0.0758    0.1144  -0.66  0.526

RMSE (S) = 3.00032   R-Sq = 97.7%   R-Sq(adj) = 96.5%

Analysis of Variance

Source           DF       SS      MS      F      P
Regression        4  3015.59  753.90  83.75  0.000
Residual Error    8    72.02    9.00
Total            12  3087.61

Source   DF   Seq SS
x1        1  1546.50
x2        1  1462.49
x3        1     2.64
x4        1     3.96
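Example 2.3 also asks for the simple linear regression of y on x1, for which no output is shown. A minimal Python sketch of that fit by the usual closed-form least squares formulas is given below (variable names are illustrative). Because x1 enters the multiple regression first, its sequential sum of squares (1546.50) is the regression sum of squares of this simple fit, and the total sum of squares (3087.61) is common to both, which gives two checks against the output above.

```python
# Data from Example 2.3: response y and the first predictor x1.
y = [78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
     72.5, 93.1, 115.9, 83.8, 113.3, 119.4]
x1 = [7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10]

n = len(y)
mean_x = sum(x1) / n
mean_y = sum(y) / n

# Corrected sums of squares and products.
sxx = sum((x - mean_x) ** 2 for x in x1)
sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(x1, y))
ss_total = sum((v - mean_y) ** 2 for v in y)

slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_regression = sxy ** 2 / sxx          # SS explained by x1 alone
r_squared = ss_regression / ss_total

print(f"y = {intercept:.2f} + {slope:.3f} x1")
print(f"Regression SS = {ss_regression:.2f}, Total SS = {ss_total:.2f}")
print(f"R-Sq = {100 * r_squared:.1f}%")
```

On its own, x1 accounts for roughly half of the variation in y, compared with 97.7% for the four-predictor model.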

From the above example, it can be seen that 97.7% of the variation in y is explained by x1, x2, x3 and x4. The coefficients of x1 and x2 are significantly different from zero, whereas those of x3 and x4 are not.

ANOVA and ANCOVA

Example 2.4: A trial was designed to evaluate 15 rice varieties grown in soil with a toxic level of iron. The experiment was in a RCB design with three replications. Guard rows of a susceptible check variety were planted on two sides of each experimental plot. Scores for tolerance for iron toxicity were collected from each experimental plot as well as from guard rows. For each experimental plot, the score of the susceptible check (averaged over two guard rows) constitutes the value of the covariate for that plot. Data on the tolerance score of each variety (Y variable) and on the score of the corresponding susceptible check (X variable) are shown below:

Scores for tolerance for iron toxicity (Y) of 15 rice varieties and those of the corresponding guard rows of a susceptible check variety (X) in a RCB trial

Variety    Replication-I    Replication-II    Replication-III
Number       X      Y         X      Y          X      Y
 1.         15     22        16     13         16     14
 2.         16     14        15     23         15     23
 3.         15     24        15     24         15     23
 4.         16     13        15     23         15     23
 5.         17     17        17     16         16     16
 6.         16     14        15     23         15     23
 7.         16     13        15     23         16     13
 8.         16     16        17     17         16     16
 9.         17     14        15     23         15     24
10.         17     17        17     17         15     26
11.         16     15        15     24         15     25
12.         16     15        15     23         15     23
13.         15     24        15     24         16     15
14.         15     25        15     24         15     23
15.         15     24        15     25         16     16

For performing the ANOVA for the above data using MINITAB, first enter the data in the MINITAB worksheet in four columns, C1: rep; C2: trt; C3: Y and C4: X. Now from the menus choose Stat → ANOVA → General Linear Model. In the Response box, enter the variable Y; in the Model box, enter trt rep. Specify the terms for comparing means as trt and choose the method for multiple comparisons. As the interest is in making all possible pairwise treatment comparisons, select the Tukey or Bonferroni method. Check the box Test for multiple comparison output. If only ANOVA is to be performed, then C4 is not required. The output obtained is given in the sequel.

The usual analysis of variance without using the covariate (X variable) is as follows:

Source         DF      SS   Mean Square   F (F-calc)   p (Pr>F)
Treatment      14  265.91         18.99         1.04      0.445
Replication     2  104.04         52.02         2.85      0.075
Error          28  510.62         18.24
Total          44  880.58

R-Square = 0.4201 (42.01%)   R-Sq(Adj) = 8.88%   s (Root MSE) = 4.2704   C.V. = 21.5436   Y - Mean = 19.82222
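The sums of squares in the ANOVA without the covariate can be verified directly from the Y scores of Example 2.4. The following Python sketch (standard library only; names are illustrative) computes the randomized complete block partition of the total sum of squares into treatment, replication and error components. It does not attempt the p-values, which require the F distribution.

```python
# Tolerance scores Y from Example 2.4: one row per variety (1..15),
# columns are Replications I, II and III.
scores = [
    [22, 13, 14], [14, 23, 23], [24, 24, 23], [13, 23, 23], [17, 16, 16],
    [14, 23, 23], [13, 23, 13], [16, 17, 16], [14, 23, 24], [17, 17, 26],
    [15, 24, 25], [15, 23, 23], [24, 24, 15], [25, 24, 23], [24, 25, 16],
]

t = len(scores)          # number of treatments (varieties)
r = len(scores[0])       # number of replications (blocks)
grand_total = sum(sum(row) for row in scores)
correction = grand_total ** 2 / (t * r)

ss_total = sum(v ** 2 for row in scores for v in row) - correction
ss_trt = sum(sum(row) ** 2 for row in scores) / r - correction
rep_totals = [sum(row[j] for row in scores) for j in range(r)]
ss_rep = sum(tot ** 2 for tot in rep_totals) / t - correction
ss_err = ss_total - ss_trt - ss_rep

df_trt, df_rep = t - 1, r - 1
df_err = df_trt * df_rep
ms_err = ss_err / df_err
f_trt = (ss_trt / df_trt) / ms_err
f_rep = (ss_rep / df_rep) / ms_err
cv = 100 * ms_err ** 0.5 / (grand_total / (t * r))

print(f"Treatment SS = {ss_trt:.2f}, Replication SS = {ss_rep:.2f}")
print(f"Error SS = {ss_err:.2f}, Total SS = {ss_total:.2f}")
print(f"F(treatment) = {f_trt:.2f}, F(replication) = {f_rep:.2f}, CV = {cv:.2f}%")
```

The same partition applied to a different response only needs the scores list replaced; the covariance analysis itself needs an additional adjustment for X and is not covered by this sketch.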

Least Squares Treatment Means for yield are

Treatment    Mean   SE mean
 1          16.33     2.466
 2          20.00     2.466
 3          23.67     2.466
 4          19.67     2.466
 5          16.33     2.466
 6          20.00     2.466
 7          16.33     2.466
 8          16.33     2.466
 9          20.33     2.466
10          20.00     2.466
11          21.33     2.466
12          20.33     2.466
13          21.00     2.466
14          24.00     2.466
15          21.67     2.466

Neither Bonferroni Simultaneous Tests nor Tukey Simultaneous Tests for making all possible pairwise treatment comparisons resulted into pF) 0.000 0.015 0.084

Y - Mean 19.82222 P 0.000 0.000

It is interesting to note that the use of a covariate has resulted in a considerable reduction in the error mean square, and hence the CV has also reduced drastically. This has helped in detecting the small differences among the treatment effects as significant, which was not possible when the covariate was not used. The covariance analysis thus results in a more precise comparison of treatment effects.

Least Squares Treatment Means for yield are

Treatment    Mean   SE mean
 1          16.87     1.177
 2          18.51     1.185
 3          20.15     1.229
 4          18.18     1.185
 5          22.96     1.356
 6          18.51     1.185
 7          16.87     1.177
 8          20.93     1.265
 9          20.87     1.177
10          24.60     1.265
11          19.84     1.185
12          18.84     1.185
13          19.51     1.185
14          20.48     1.229
15          20.18     1.185

The probabilities of significance of pairwise comparisons among the least squares estimates of the treatment effects, based on Tukey Simultaneous Tests, are given below.

i/j        1       2       3       4       5       6       7       8
 1         .
 2    0.9994       .
 3    0.8280  0.9994       .
 4    1.0000  1.0000  0.9959       .
 5    0.0930  0.5359  0.9754  0.4249       .
 6    0.9994  1.0000  0.9994  1.0000  0.5359       .
 7    1.0000  0.9994  0.8280  1.0000  0.0930  0.9994       .
 8    0.5536  0.9840  1.0000  0.9551  0.9945  0.9840  0.5536       .
 9    0.5302  0.9789  1.0000  0.9418  0.9958  0.9789  0.5302  1.0000
10    0.0077  0.0930  0.5359  0.0622  0.9994  0.0930  0.0077  0.6586
11    0.8890  0.9999  1.0000  0.9992  0.9219  0.9999  0.8890  1.0000
12    0.9959  1.0000  1.0000  1.0000  0.6510  1.0000  0.9959  0.9958
13    0.9504  1.0000  1.0000  0.9999  0.8529  1.0000  0.9504  0.9999
14    0.7204  0.9959  1.0000  0.9829  0.9917  0.9959  0.7204  1.0000
15    0.7967  0.9992  1.0000  0.9949  0.9655  0.9992  0.7967  1.0000

i/j        9      10      11      12      13      14      15
 9         .
10    0.6780       .
11    1.0000  0.3659       .
12    0.9945  0.1363  1.0000       .
13    0.9999  0.2713  1.0000  1.0000       .
14    1.0000  0.6510  1.0000  0.9994  1.0000       .
15    1.0000  0.4762  1.0000  0.9999  1.0000  1.0000       .

Treatments 1 and 10, and treatments 7 and 10, are found to be significantly different (p = 0.0077 in both cases).

Combined Analysis of Data

For the data in Example 6.2 in Fundamentals of Design of Experiments given in Module 2: enter the data in the MINITAB worksheet in 5 columns, C1: Yr; C2: Rep; C3: blk; C4: trt; C5: Yield. Here Yr, Rep, blk and trt denote the year, replication, block and treatment, respectively. First, split the worksheet into separate worksheets for the two years. This can be achieved by selecting Data → Split Worksheet → By variable Yr. Now, using the worksheet for Year 1, choose from the menu: Stat → ANOVA → General Linear Model. In the Response box, enter the variable yield; in Model enter Rep blk(rep) trt and click OK. The output obtained is given in the sequel.

Analysis of Variance for yield: Year 1 (Using Adjusted SS for Tests)

Source      DF    Seq SS    Adj SS   Adj MS      F      P
rep          3   186.046   186.046   62.015   7.53  0.000
blk(rep)    24  1408.858   358.943   14.956   1.82  0.019
trt         48  3442.148  3442.148   71.711   8.70  0.000
Error      120   988.707   988.707    8.239
Total      195  6025.758

S = 2.87040   R-Sq = 83.59%   R-Sq(adj) = 73.34%

Similarly, the analysis of data for the second year can be performed; the results obtained are given in the sequel.

Analysis of Variance for yield: Year 2 (Using Adjusted SS for Tests)

Source      DF    Seq SS    Adj SS   Adj MS      F      P
rep          3   176.399   176.399   58.800  11.81  0.000
blk(rep)    24  1287.011   556.491   23.187   4.66  0.000
trt         48  3353.212  3353.212   69.859  14.03  0.000
Error      120   597.305   597.305    4.978
Total      195  5413.927

S = 2.23104   R-Sq = 88.97%   R-Sq(adj) = 82.07%

The interpretations are the same as given in Example 2, Section 6. Equality of error variances can be tested using the F-test. From the two analyses above, the error variances are heterogeneous. Therefore, the data were transformed by dividing each observation by the corresponding root mean square error. For this, create a new column of root mean square error in the worksheet and create a new variable = original variable/sqrt(MSE) using Calc → Calculator. In addition to the above steps, select the new variable as the response variable, enter the model as yr rep(yr) blk(rep yr) trt trt*yr, and define Yr in the subdialog box Random Factors. Now click on Results and check Display expected mean squares and variance components. The results obtained are

General Linear Model: Transformed Variable versus yr, trt, rep, blk

Factor        Type     Levels  Values
yr            random        2  1, 2
rep(yr)       random        8  1, 2, 3, 4, 1, 2, 3, 4
blk(yr rep)   random       56  1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7,
                               1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7,
                               1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7,
                               1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7
trt           fixed        49  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
                               15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
                               27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
                               39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49
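The two preparatory steps of the combined analysis, comparing the yearly error mean squares and rescaling each year's observations by its root mean square error, can be sketched in Python as follows. The error mean squares (8.239 and 4.978, each on 120 degrees of freedom) are taken from the two yearly ANOVA tables; the yields in example_yields are hypothetical values used only to show the rescaling, since the data of Example 6.2 are not reproduced in this note.

```python
import math

# Error mean squares and degrees of freedom from the yearly ANOVA tables.
mse_year1, df_year1 = 8.239, 120
mse_year2, df_year2 = 4.978, 120

# Variance-ratio statistic with the larger mean square in the numerator;
# it is compared with the tabulated F value on (120, 120) df.
f_ratio = max(mse_year1, mse_year2) / min(mse_year1, mse_year2)
print(f"F = {f_ratio:.3f} on ({df_year1}, {df_year2}) df")


def scale_by_root_mse(values, mse):
    """Divide each observation by the root mean square error of its year."""
    root_mse = math.sqrt(mse)
    return [v / root_mse for v in values]


# Hypothetical Year 1 yields, for illustration only (not from the source data).
example_yields = [30.0, 25.5, 28.2]
print([round(v, 3) for v in scale_by_root_mse(example_yields, mse_year1)])
```

In MINITAB the same rescaling is what the Calc → Calculator expression original variable/sqrt(MSE) carries out, year by year.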

Analysis of Variance for transformed variable, using Adjusted SS for Tests

Source         DF    Seq SS    Adj SS    Adj MS       F      P
yr              1  4911.415  4911.415  4911.415  422.37  0.000 x
rep(yr)         6    58.828    58.828     9.805    2.80  0.023 x
blk(yr rep)    48   439.557   139.74      2.911    2.58  0.00
trt            48   968.42    968.42     20.175    7.40  0.00
yr*trt         48   130.857   130.857     2.726    2.41  0.00
Error         240   271.335   271.335     1.131
Total         391  6780.412

x Not an exact F-test.

S = 1.06328   R-Sq = 96.00%   R-Sq(adj) = 93.48%

Expected Mean Squares, using Adjusted SS

   Source        Expected Mean Square for Each Term
1  yr            (6) + 4.0000 (5) + 7.0000 (3) + 49.0000 (2) + 196.0000 (1)
2  rep(yr)       (6) + 7.0000 (3) + 49.0000 (2)
3  blk(yr rep)   (6) + 5.2500 (3)
4  trt           (6) + 3.5000 (5) + Q[4]
5  yr*trt        (6) + 3.5000 (5)
6  Error         (6)

Error Terms for Tests, using Adjusted SS

   Source        Error DF  Error MS  Synthesis of Error MS
1  yr                8.33    11.628  (2) + 1.1429 (5) - 1.1429 (6)
2  rep(yr)          39.06     3.505  1.3333 (3) - 0.3333 (6)
3  blk(yr rep)     240.00     1.131  (6)
4  trt              48.00     2.726  (5)
5  yr*trt          240.00     1.131  (6)
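The synthesized error terms marked "Not an exact F-test" can be recomputed from the adjusted mean squares and the synthesis formulas printed in the output. The Python sketch below does this for yr and rep(yr). The error degrees of freedom are obtained here with Satterthwaite's approximation, which is an assumption on our part about how the printed values (8.33 and 39.06) arise; the numbers agree closely.

```python
# Adjusted mean squares and df, keyed by the term numbers used in the
# Expected Mean Squares table: 1 yr, 2 rep(yr), 3 blk(yr rep), 4 trt,
# 5 yr*trt, 6 Error.
ms = {1: 4911.415, 2: 9.805, 3: 2.911, 4: 20.175, 5: 2.726, 6: 1.131}
df = {1: 1, 2: 6, 3: 48, 4: 48, 5: 48, 6: 240}


def synthesize(coefficients):
    """Return (error MS, approximate error df) for a linear combination
    of mean squares given as {term number: coefficient}."""
    parts = {k: c * ms[k] for k, c in coefficients.items()}
    error_ms = sum(parts.values())
    # Satterthwaite approximation to the degrees of freedom.
    error_df = error_ms ** 2 / sum(p ** 2 / df[k] for k, p in parts.items())
    return error_ms, error_df


# Synthesis formulas as printed in the MINITAB output.
err_yr, df_yr = synthesize({2: 1.0, 5: 1.1429, 6: -1.1429})
err_rep, df_rep = synthesize({3: 1.3333, 6: -0.3333})

f_yr = ms[1] / err_yr
f_rep = ms[2] / err_rep

print(f"yr:      Error MS = {err_yr:.3f}, Error DF = {df_yr:.2f}, F = {f_yr:.2f}")
print(f"rep(yr): Error MS = {err_rep:.3f}, Error DF = {df_rep:.2f}, F = {f_rep:.2f}")
```

Small differences in the last digit relative to the printed output are to be expected, because the inputs used here are the rounded mean squares and coefficients.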

It can easily be seen that the testing of random effects has been done in one step using MINITAB.

Factorial Experiments

The data given in Example 7.1 can be analyzed using MINITAB. Enter the data in the MINITAB worksheet in 6 columns, C1: Rep; C2: Block; C3: N; C4: P; C5: K; C6: Yield. Choose: Stat → ANOVA → General Linear Model. In the dialog box, define Yield as the response variable. In the model, define rep block(rep) n p k n*p n*k p*k n*p*k. Now choose Comparisons and click the radio button Pairwise Comparisons. In Terms, define n p k n*p n*k p*k n*p*k, and check the boxes Tukey Method and Test.

Analysis of Variance for yield, using Adjusted SS for Tests

Source      DF    Seq SS    Adj SS   Adj MS      F      P
rep          3   15.7187   15.7187   5.2396  10.71  0.000
blk(rep)     8   14.5571   14.1946   1.7743   3.63  0.003
n            2   89.1108   89.1108  44.5554  91.05  0.000
p            2   55.9270   55.9270  27.9635  57.14  0.000
k            1    3.2173    3.2173   3.2173   6.57  0.014
n*p          4    4.2752    4.2752   1.0688   2.18  0.087
n*k          2    0.7301    0.7301   0.3650   0.75  0.480
p*k          2    0.1128    0.1128   0.0564   0.12  0.891
n*p*k        4    2.1958    2.1958   0.5490   1.12  0.359
Error       43   21.0427   21.0427   0.4894
Total       71  206.8876

S = 0.699547   R-Sq = 89.83%   R-Sq(adj) = 83.21%

The probabilities of significance of pairwise comparisons among levels of N based on Tukey Simultaneous Tests are

i/j       40      80     120
40         .
80    0.0000       .
120   0.0000  0.0000       .

The probabilities of significance of pairwise comparisons among levels of P based on Tukey Simultaneous Tests are

i/j        0      40      80
0          .
40    0.0000       .
80    0.0000  0.0121       .

The probability of significance of the pairwise comparison among levels of K based on Tukey Simultaneous Tests is

K         Difference of Means   SE of Difference   T-Value   Adjusted P-Value
40 - 0                 0.4228             0.1649     2.564             0.0139

Similarly, the probabilities of significance of pairwise comparisons among levels of N*P, N*K, P*K and N*P*K based on Tukey Simultaneous Tests can be obtained.

Diagnostics and Remedial Measures

Steps for carrying out these diagnostics and remedial measures using MINITAB: first fit the model as per the design adopted, using Stat → ANOVA → General Linear Model from the menus, and in the dialog box select Storage and store the residuals in a column of the worksheet. Once the residuals are stored in the worksheet, use the following steps.

Testing Normality
From the menus choose: Stat → Basic Statistics → Normality Test. In the dialog box, select the stored residuals as the variable in the Variable list, then select one of the three tests, viz. the Anderson-Darling, Ryan-Joiner and Kolmogorov-Smirnov tests, and click OK.

Test for Homogeneity of Variances
From the menus choose: Stat → ANOVA → Test for Equal Variances. In the dialog box, select the stored residuals in the Response box and the treatment in the Factors box, then choose the confidence level and click OK.

Transformations of Data
For making logarithmic, square root and arcsine transformations, one can use Calc → Calculator. Enter a target column of the worksheet in which the result is to be stored, and then define the function to be used for the transformation in the Expression box. For the logarithmic transformation, define LOGT (column number or variable name to be transformed) and click OK. The transformed data will be stored in the target column. For the square root transformation, use SQRT (column number or variable name to be transformed) in the Expression box, and for the arcsine transformation use the expression ASIN (sqrt of the column number in which data is given/100)*180*7/22. The multiplication by 180*7/22 converts the result from radians to degrees, 22/7 being used as an approximation to π. If the original data lie between 0 and 1, then do not divide by 100.

Now perform the analysis again and test normality and homogeneity of the error terms. If the errors are now normal and homogeneous, perform the analysis on the transformed data; otherwise use an appropriate non-parametric test. For performing the non-parametric analysis, from the menus choose: Stat → Nonparametrics → appropriate test (Friedman, say). In the dialog box, select the Response, Treatment and Block variables and click OK.

Example 2.4: Suppose an entomologist is interested in determining whether four different kinds of traps caught equivalent numbers of insects when applied to the same field. Each of the traps is used six times on the field, and the resulting data (number of insects per hour) are shown below along with the mean, variance and range.

Treatment              Replication                  Mean   Variance   Range
              I     II    III    IV     V     VI    (Yi)     (Si²)
A             3      1     12     7    17      2       7      40.4      16
B             9     29     21    24    28     45      31     138.4      36
C            63     84     97    61    98     71      79     270.8      37
D           172    118    109   172   143    168     147     798.4      63

From the table it is clear that variances are heterogeneous and variance is proportional to mean. Obtain the residuals for testing the normality and homogeneity of error terms. The residuals obtained are given below:

Treatment                      Replication                        Mean   Variance
               I       II      III      IV       V       VI                (Si²)
A          -1.00     0.75    10.00   -1.25    3.25   -11.75         0      50.35
B         -14.00     9.75     0.00   -3.25   -4.75    12.25         0      94.85
C         -13.00    11.75    23.00  -19.25   12.25   -14.75         0     314.85
D          28.00   -22.25   -33.00   23.75  -10.75    14.25         0     650.20

Normality of error terms:

Test                       Statistic   p-value
Anderson-Darling (AD)          0.208     0.848
Ryan-Joiner (RJ)               0.992    >0.100
Kolmogorov-Smirnov (KS)        0.110    >0.150

The errors were found to be normally distributed. Therefore, homogeneity of error variances was tested using Bartlett's test. Using MINITAB, we get the output as

Bartlett's Test (normal distribution): Test statistic = 8.32, p-value = 0.040

The ratios of variance to mean, Si²/Ȳi., are 5.77, 5.32, 3.43 and 5.43, indicating that the variance is proportional to the mean. Therefore, the square root transformation should be used. After application of the square root transformation, the residuals are

Treatment                            Replication                                  Variance
                 I         II        III         IV          V         VI          (Si²)
A         -0.03614   -0.92542    1.05800    0.20614    0.98287   -1.28544          0.928
B         -1.34939    0.87854   -0.40473   -0.12183   -0.42993    1.42735          0.999
C         -0.28226    0.78841    0.99143   -1.08068    0.30794   -0.72483          0.694
D          1.66779   -0.74153   -1.64469    0.99637   -0.86087    0.58293          1.622

Normality of error terms on the transformed data:

Test                       Statistic   p-value
Anderson-Darling (AD)          0.391     0.353
Ryan-Joiner (RJ)               0.984    >0.100
Kolmogorov-Smirnov (KS)        0.127    >0.150

The errors remain normally distributed after transformation. The results of homogeneity of error variances using Bartlett's test are Bartlett's Test (normal distribution): Test statistic = 0.89, p-value = 0.828 Hence, we conclude that the errors are normally distributed and have a constant variance after transformation.
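Bartlett's statistic can be recomputed from the four residual variances before and after transformation. The Python sketch below is a minimal illustration for equal group sizes (six residuals per treatment); the function names are ours, and the p-value helper uses a closed-form upper tail that holds only for a chi-square distribution with exactly 3 degrees of freedom, which is the case here with four treatments.

```python
import math


def bartlett_statistic(variances, n_per_group):
    """Bartlett's chi-square statistic for k groups of equal size."""
    k = len(variances)
    df_each = n_per_group - 1
    df_total = k * df_each
    pooled = sum(variances) / k          # equal df, so a simple average
    numerator = (df_total * math.log(pooled)
                 - df_each * sum(math.log(v) for v in variances))
    correction = 1 + (k / df_each - 1 / df_total) / (3 * (k - 1))
    return numerator / correction


def chi2_upper_tail_3df(x):
    """P(chi-square with 3 df > x); closed form valid for 3 df only."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Residual variances for traps A, B, C, D from the two residual tables.
original = [50.35, 94.85, 314.85, 650.20]
transformed = [0.928, 0.999, 0.694, 1.622]

for label, variances in (("original", original), ("transformed", transformed)):
    stat = bartlett_statistic(variances, n_per_group=6)
    p_value = chi2_upper_tail_3df(stat)
    print(f"{label:12s} statistic = {stat:.2f}, p-value = {p_value:.3f}")
```

The two lines printed should be close to the MINITAB results quoted above (8.32 with p = 0.040 before transformation, 0.89 with p = 0.828 after).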

The results of analysis of variance with original and transformed data are given in the sequel.

ANOVA: Original Data

Source         DF    Seq SS   Adj. SS   Mean Square   F (F-calc)   p (Pr>F)
Replication     5     689.0     689.0         137.8         0.37       0.86
Treatment       3   70828.5   70828.5       23609.5        63.80       0.00
Error          15    5551.0    5551.0         370.1
Total          23   77068.5

R-Square = 92.80%   R-Sq(Adj) = 88.96%   s (Root MSE) = 19.2371

Tukey Simultaneous Tests for All Pairwise Treatment Comparisons

          1       2       3       4
1         .
2    0.3525       .
3    0.0001  0.0013       .
4    0.0000  0.0000  0.0001       .

ANOVA: Transformed Data

Source         DF    Seq SS   Adj. SS   Mean Square   F (F-calc)   p (Pr>F)
Replication     5     5.055     5.055         1.011         0.71      0.622
Treatment       3   326.603   326.603       108.868        76.98      0.000
Error          15    21.214    21.214         1.414
Total          23   352.872

R-Square = 93.99%   R-Sq(Adj) = 90.78%   s (Root MSE) = 1.18922

Tukey Simultaneous Tests for All Pairwise Treatment Comparisons

          1       2       3       4
1         .
2    0.0091       .
3    0.0000  0.0003       .
4    0.0000  0.0000  0.0015       .

With the transformed data, treatments 1 and 2 are significantly different, whereas with the original data they were not.

Probit Analysis

Example 1: Finney (1971) gave data representing the effect of a series of doses of rotenone (an insecticide) when sprayed on Macrosiphoniella sanborni (the chrysanthemum aphid). The table below contains the concentration, the number of insects tested at each dose, the proportion dying and the probit transformation (probit+5) of each of the observed proportions.

Concentration   No. of insects   No. of affected   % kill   Log concentration   Empirical
(mg/l)          (n)              (r)               (P)      (x)                 probit
10.2            50               44                88       1.01                6.18
 7.7            49               42                86       0.89                6.08
 5.1            46               24                52       0.71                5.05
 3.8            48               16                33       0.58                4.56
 2.6            50                6                12       0.41                3.82
 0              49                0                 0

Steps for carrying out the Probit Analysis using MINITAB

For the data given in Example 1, first enter the data in the MINITAB worksheet in three columns, C1: dose; C2: total insects; C3: insects killed or affected. Now create a column C4 for logdose by using LOGT(C1) from the Calc menu. Now choose Stat > Reliability/Survival > Probit Analysis. In the dialog box, choose the data format "Success/trial" or "Response/frequency". In the present case, the data are in success/trial format; therefore, enter C3, the column containing the number of successes, in the Number of Successes box and C2, the total number of trials, in the Number of Trials subbox. In the subbox for stress/stimulus enter C4, the column containing the logdose. Since there is only one stimulus, the subbox pertaining to Factor (optional) may be left blank. Choose the distribution as normal.

The other options available in the dialog box are Estimate, Graphs, Options, Results and Storage. Using the option Estimate, one can
- estimate percentiles for the percents you specify. These percentiles are added to the default table of percentiles.
- estimate survival probabilities for the stress values you specify.
One can also change the method of estimation for the confidence intervals and the level of confidence. The default option is two-sided 95% fiducial intervals. Other options may also be used, as and when required. For this example, we chose 65 as an additional percentile and requested the survival probability for stress level 0.9 (logdose).

Probit Analysis: affect, total versus logdose

Distribution: Normal

Response Information

Variable   Value     Count
affect     Success     132
           Failure     111
total      Total       243

Estimation Method: Maximum Likelihood

Regression Table

Variable       Coef   Standard Error       Z       P
Constant   -2.88746         0.350134   -8.25   0.000
logdose     4.21320         0.478303    8.81   0.000

Log-Likelihood = -120.052

Goodness-of-Fit Tests

Method     Chi-Square   DF       P
Pearson       1.72888    3   0.631
Deviance      1.73897    3   0.628

Tolerance Distribution: Parameter Estimates

                                           95.0% Normal CI
Parameter   Estimate   Standard Error     Lower       Upper
Mean        0.685338        0.0220962   0.642030   0.728646
StDev       0.237349        0.0269451   0.190001   0.296497

Table of Percentiles

                                           95.0% Normal CI
Percent   Percentile   Standard Error      Lower        Upper
 1          0.133180        0.0686394   -0.0013503   0.267711
 2          0.197882        0.0617254    0.0769020   0.318861
 3          0.238933        0.0573944    0.126442    0.351423
 4          0.269813        0.0541723    0.163638    0.375989
 5          0.294933        0.0515787    0.193840    0.396025
 6          0.316313        0.0493935    0.219504    0.413123
 7          0.335060        0.0474969    0.241967    0.428152
 8          0.351845        0.0458160    0.262047    0.441643
 9          0.367110        0.0443030    0.280278    0.453943
10          0.381162        0.0429251    0.297031    0.465294
20          0.485580        0.0332991    0.420314    0.550845
30          0.560872        0.0274617    0.507048    0.614696
40          0.625206        0.0238086    0.578542    0.671870
50          0.685338        0.0220962    0.642030    0.728646
60          0.745470        0.0224241    0.701519    0.789420
65          0.776793        0.0233958    0.730939    0.822648
70          0.809804        0.0249330    0.760936    0.858672
80          0.885096        0.0299366    0.826422    0.943771
90          0.989513        0.0389715    0.913131    1.06590
91          1.00357         0.0402991    0.924581    1.08255
92          1.01883         0.0417626    0.936978    1.10068
93          1.03562         0.0433947    0.950564    1.12067
94          1.05436         0.0452427    0.965688    1.14304
95          1.07574         0.0473792    0.982882    1.16860
96          1.10086         0.0499232    1.00301     1.19871
97          1.13174         0.0530936    1.02768     1.23580
98          1.17279         0.0573685    1.06035     1.28523
99          1.23750         0.0642153    1.11164     1.36336

Table of Survival Probabilities

                        95.0% Normal CI
Stress   Probability     Lower       Upper
0.9         0.182888   0.122757   0.258650

Interpretation: The goodness-of-fit tests (p-values = 0.631, 0.628) suggest that the distribution and the model fit the data adequately. In this case, the fitting is done on the normal equivalent deviate only, without adding 5. Therefore, log LD50 or log ED50 corresponds to the value of probit = 0. Log LD50 is obtained as 0.685338. Therefore, the stress level at which 50% of the insects will be killed is 10^0.685338 = 4.845 mg/l. Similarly, the stress level at which 65% of the insects will be killed is 10^0.776793 = 5.981 mg/l. At logdose = 0.9, what percentage of insects will be killed? The survival probability at this stress level is 0.182888, i.e. about 18.29% of the insects are expected to survive, so about 81.71% of the insects will be killed.

If more than one factor is used for experimentation, then for the analysis of data follow the same steps as in Example 1, with the addition that in the Factor subbox the factor is defined as f.
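The back-calculations in the interpretation can be checked with a short Python sketch that treats the fitted tolerance distribution as a normal distribution on the log10 dose scale, with the mean and standard deviation taken from the MINITAB output (variable and function names are illustrative). It converts the 50th and 65th percentiles back to mg/l and turns the survival probability at logdose 0.9 into the corresponding kill fraction.

```python
from statistics import NormalDist

# Tolerance-distribution estimates from the output (log10 dose scale).
mean_log_dose = 0.685338
sd_log_dose = 0.237349
tolerance = NormalDist(mu=mean_log_dose, sigma=sd_log_dose)


def dose_for_kill(fraction):
    """Dose in mg/l at which the given fraction of insects is killed."""
    return 10 ** tolerance.inv_cdf(fraction)


ld50 = dose_for_kill(0.50)
ld65 = dose_for_kill(0.65)

# At logdose 0.9 the kill fraction is the cumulative probability, and
# the survival probability reported by MINITAB is its complement.
kill_at_0_9 = tolerance.cdf(0.9)
survival_at_0_9 = 1 - kill_at_0_9

print(f"LD50 = {ld50:.3f} mg/l, LD65 = {ld65:.3f} mg/l")
print(f"At logdose 0.9: killed = {100 * kill_at_0_9:.2f}%, "
      f"surviving = {100 * survival_at_0_9:.2f}%")
```

The mean and standard deviation used here are themselves -Constant/slope and 1/slope from the regression table, so the same numbers can also be obtained starting from the fitted coefficients.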
