Logistic regression: Part 1

STATISTICS AND RESEARCH DESIGN

Nikolaos Pandis, Associate Editor of Statistics and Research Design
Bern, Switzerland, and Corfu, Greece

In the article discussing the chi-square test,1 I used a clinical trial scenario with the objective of assessing the clinical alignment efficiency of 2 types of wires. These wires (A and B) were used for 6 months in 2 patient groups, and the outcome recorded was binary: reaching complete alignment (success) or not reaching complete alignment (failure). Table I shows the tabulation of alignment successes and failures for each wire after 6 months of treatment and the calculation of risks and odds of success, and risk and odds ratios of alignment success vs failure. The chi-square test showed no evidence of a difference in the success of alignment after 6 months between the 2 wire groups; the P value was 0.36.

The same result can be obtained using a special type of regression analysis called logistic regression, which is used when the outcome is binary (alignment: yes/no). Recall that linear regression is used when the outcome is continuous (eg, millimeters of crowding alleviation). In logistic regression, we can get effect estimates, P values, and confidence intervals directly from the regression output. The effect estimates are handled as log odds ratios (log OR, where log = natural logarithm), because they have appropriate mathematical properties (they can range from $-\infty$ to $+\infty$). We can convert the log ORs to odds ratios (ORs), which are more interpretable, by exponentiating them (OR = exp[log OR]).

Logistic regression has a form similar to that of the linear regression model ($y = a + bx$) in the sense that the components are linearly related on the logarithmic scale (when using log ORs). However, in logistic regression, the response or dependent variable y is the log odds, $\log(p/(1-p))$, which is called the logit:

$$\log\left(\frac{p}{1-p}\right) = a + b \times x \quad \text{(equation 1)}$$

where a is the intercept (constant), b is the regression coefficient of x, and x is the categorical predictor, which has 2 levels in our example (wire A or wire B). Specifically, in the logistic regression model of equation 1, a is the log odds of reaching alignment in patients in the control group, which we assume here is the group with wire B (reference), and b is the log OR of reaching alignment in patients fitted with wire A vs patients fitted with wire B.

Department of Orthodontics and Dentofacial Orthopedics, School of Dental Medicine/Medical Faculty, University of Bern, Bern, Switzerland; private practice, Corfu, Greece. Am J Orthod Dentofacial Orthop 2017;151:824-5. 0889-5406/$36.00. © 2017 by the American Association of Orthodontists. All rights reserved. http://dx.doi.org/10.1016/j.ajodo.2017.01.017

In a bit more detail, we have groups A and B, and the risk (proportion) of the event for A is $p_A$, whereas the risk of the event for B is $p_B$. The odds of the event would be $p_A/(1-p_A)$ for the wire A group and $p_B/(1-p_B)$ for the wire B group, and their natural logarithms would be $\log(p_A/(1-p_A)) = \operatorname{logit}(p_A)$ and $\log(p_B/(1-p_B)) = \operatorname{logit}(p_B)$, respectively. Then the OR of the event in group A compared with group B would be

$$\frac{p_A/(1-p_A)}{p_B/(1-p_B)} \quad \text{(equation 2)}$$

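and the logarithm of the OR would be

$$\log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right) = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \operatorname{logit}(p_A) - \operatorname{logit}(p_B) \quad \text{(equation 3)}$$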

If we use the values 0 and 1 for wires B and A, respectively, then after appropriate substitutions in equation 1 and using equations 2 and 3, we arrive at the following.

For wire B:

$$\log\left(\frac{p}{1-p}\right) = a + b \times x = a + b \times 0 = a, \quad \text{or} \quad a = \log\left(\frac{p_B}{1-p_B}\right)$$

For wire A:

$$\log\left(\frac{p}{1-p}\right) = a + b \times x = a + b \times 1 = a + b$$

and

$$b = \log\left(\frac{p}{1-p}\right) - a = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right)$$

Remember that b is the log OR, our estimate, of reaching alignment in patients fitted with wire A vs patients fitted with wire B; after exponentiation, this gives the OR.
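The substitution above can be checked numerically. The following is a minimal Python sketch, assuming the counts reported in Table I (23 of 31 patients aligned with wire A, 19 of 30 with wire B); the helper names `logit` and `inv_logit` are illustrative and do not come from the article.

```python
import math

def logit(p):
    """Log odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Convert log odds back to a probability."""
    return 1 / (1 + math.exp(-z))

# Risks of alignment taken from Table I (assumed here for illustration).
p_A = 23 / 31
p_B = 19 / 30

# With x = 0 for wire B and x = 1 for wire A:
a = logit(p_B)        # intercept: log odds in the reference group (wire B)
b = logit(p_A) - a    # coefficient: log OR of wire A vs wire B

odds_ratio = math.exp(b)
print(f"a = {a:.4f}, b = {b:.4f}, OR = {odds_ratio:.2f}")

# The model reproduces each group's risk: x = 0 gives p_B, x = 1 gives p_A.
assert abs(inv_logit(a + b * 0) - p_B) < 1e-12
assert abs(inv_logit(a + b * 1) - p_A) < 1e-12
```

Because the sketch uses unrounded risks, the log OR it prints can differ in the fourth decimal place from a value computed from odds rounded to 2 decimals.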


Table I. Tabulation of alignment success and failure after 6 months of treatment by wire type and calculation of risk and odds of success, and risk and odds ratios of alignment success vs failure

Alignment    Wire A    Wire B    Total
Yes          a = 23    b = 19    42
No           c = 8     d = 11    19
Total        31        30        61

How many aligned with A? Risk = 23/31 = 0.74; odds = 23/8 = 2.88
How many aligned with B? Risk = 19/30 = 0.63; odds = 19/11 = 1.73
Risk ratio: 0.74/0.63 = 1.17
Odds ratio (OR): 2.88/1.73 = 1.66
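The quantities in Table I can be recomputed directly from the 4 cell counts. This is a small Python sketch for illustration only; the variable names are assumptions and simply mirror the cell labels a, b, c, and d used in the table.

```python
# Cell counts from Table I: rows are alignment yes/no, columns are wire A/B.
a, b = 23, 19   # aligned with wire A, aligned with wire B
c, d = 8, 11    # not aligned with wire A, not aligned with wire B

# Risk = events / all patients in the group; odds = events / nonevents.
risk_A = a / (a + c)
risk_B = b / (b + d)
odds_A = a / c
odds_B = b / d

risk_ratio = risk_A / risk_B
odds_ratio = odds_A / odds_B

print(f"Wire A: risk = {risk_A:.2f}, odds = {odds_A:.2f}")
print(f"Wire B: risk = {risk_B:.2f}, odds = {odds_B:.2f}")
print(f"Risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")
```

Keeping risks and odds as separate variables makes the distinction between the risk ratio and the OR explicit: the same counts give different ratios.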

We can easily arrive at an estimate if we substitute the odds from Table I:

$$\log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right) = \log\left(\frac{2.88}{1.73}\right) = 0.5096688$$

The value 0.5096688 after exponentiation becomes 1.66, which is the OR of reaching alignment in group A vs group B.

The calculation of the 95% confidence interval (CI) for this OR is as follows.2 An approximation for the 95% CI for an odds ratio can be calculated as

lower CI bound = OR/EF
upper CI bound = OR × EF

where EF is the error factor, calculated as

$$EF = \exp\left(1.96 \times \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}\right)$$

and a, b, c, and d refer to the numbers of events and nonevents of alignment in wire groups A and B (Table I); these cell counts should not be confused with the intercept a and the coefficient b of equation 1. Using the example above, we can calculate the 95% CI as follows:

$$EF = \exp\left(1.96 \times \sqrt{\frac{1}{23} + \frac{1}{19} + \frac{1}{8} + \frac{1}{11}}\right) = 2.99$$

lower value: OR/EF = 1.66/2.99 = 0.56
upper value: OR × EF = 1.66 × 2.99 = 4.96

We will give a logistic regression example using the same data set that produced the tabulations in Table I.
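The error-factor calculation above can be reproduced in a few lines. This is a hedged Python sketch that assumes the Table I cell counts; the variable names (`se_log_or`, `error_factor`) are illustrative choices, not terms from the article.

```python
import math

# Cell counts from Table I (events and nonevents for wires A and B).
a, b, c, d = 23, 19, 8, 11

odds_ratio = (a / c) / (b / d)

# Approximate standard error of the log OR, then the error factor (EF).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
error_factor = math.exp(1.96 * se_log_or)

# Approximate 95% CI: divide and multiply the OR by the error factor.
lower = odds_ratio / error_factor
upper = odds_ratio * error_factor

print(f"OR = {odds_ratio:.2f}, EF = {error_factor:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Working with unrounded values gives an upper bound of about 4.97 rather than 4.96; the small difference arises because the hand calculation multiplies the rounded values 1.66 and 2.99.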

Table II. Logistic regression output for the effect of wire type on alignment success after 6 months of treatment

Predictor      OR           95% CI        P value
Wire type
  B            Reference
  A            1.66         0.56, 4.96    0.36
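The article does not state which software produced Table II. As an illustration only, the sketch below fits the model of equation 1 to the grouped counts from Table I by Newton-Raphson in plain Python and derives a Wald P value and 95% CI; the function name `fit_logistic`, the fixed number of iterations, and the choice of a Wald test are all assumptions made for this example.

```python
import math

# Grouped data from Table I: (x, number of patients, number aligned).
# x = 0 codes wire B (reference) and x = 1 codes wire A.
groups = [(0, 30, 19), (1, 31, 23)]

def fit_logistic(groups, iterations=25):
    """Fit logit(p) = a + b*x by Newton-Raphson; return (a, b, se_b)."""
    a = b = 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[i00, i01], [i01, i11]].
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, n, y in groups:
            p = 1 / (1 + math.exp(-(a + b * x)))
            w = n * p * (1 - p)
            g0 += y - n * p
            g1 += (y - n * p) * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system (information * step = score).
        a += (i11 * g0 - i01 * g1) / det
        b += (i00 * g1 - i01 * g0) / det
    # Standard error of b from the inverse information of the last iteration,
    # which equals the value at the estimates once the fit has converged.
    se_b = math.sqrt(i00 / det)
    return a, b, se_b

a, b, se_b = fit_logistic(groups)

odds_ratio = math.exp(b)
lower = math.exp(b - 1.96 * se_b)
upper = math.exp(b + 1.96 * se_b)

# Two-sided Wald P value from the standard normal distribution.
z = b / se_b
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}, P = {p_value:.2f}")
```

With a single binary predictor the fitted coefficient b equals the log OR from the 2 × 2 table, so the fit should agree with the hand calculation apart from rounding.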

The application of logistic regression relies on many of the same assumptions that we use to apply univariable and multivariable linear regression. Similar to multivariable linear regression, logistic regression gives us the flexibility to include more than 1 predictor (unlike the chi-square test), both categorical and continuous, as well as interaction terms. Additionally, we can obtain the estimates in the form of ORs (or log ORs) and their corresponding 95% CIs, and not just a P value.

Table II gives the output after fitting a logistic regression model to assess the association between wire type and alignment success. The dependent variable is binary (success or failure of alignment), and the independent variable is the wire type with 2 levels/groups/categories: wire A and wire B. We can see in Table II that we get results similar to those of the chi-square test, but here we also automatically get the effect estimate (OR), the 95% CI, and the P value.

The interpretation of the OR is as follows: the odds of reaching alignment are 1.66 times as high, or 66% higher, with wire A compared with wire B. It is incorrect to say probability of success because probability is equivalent to risk, but here we are dealing with odds. Refer also to previous articles to refresh your memory on the differences between risk, odds, risk ratio, and odds ratio.3,4

The 95% CI ranges from 0.56 to 4.96. Here we are working with ratios; hence, when the 95% CI includes the value of 1, we infer that there is no evidence of a difference between the wires in terms of alignment success. If the odds of alignment were the same with both wires, such as 0.70 and 0.70, then the OR would be equal to 1. When we are working with differences, the value of zero indicates no difference. For logistic regression, zero is the value of no difference if we are working on the natural log scale (log OR), because log(1) = 0.

REFERENCES
1. Pandis N. The chi-square test. Am J Orthod Dentofacial Orthop 2016;150:898-9.
2. Kirkwood BR, Sterne JA. Essential medical statistics. 2nd ed. Oxford, United Kingdom: Blackwell; 2003. p. 163-4.
3. Pandis N. The effect size. Am J Orthod Dentofacial Orthop 2012;142:739-40.
4. Pandis N. Risk ratio vs odds ratio. Am J Orthod Dentofacial Orthop 2012;142:890-1.

American Journal of Orthodontics and Dentofacial Orthopedics

April 2017  Vol 151  Issue 4