Analysis of proportions using arcsine transform with any experimental design

Introduction: Exact tests on proportions exist for single-group and two-group designs, but no general test on proportions exists that is appropriate for any experimental design involving more than two groups, repeated measures, and/or factorial designs.
Method: Herein, we extend the analysis of proportions using arcsine transform to any sort of design. The resulting framework, which we have called Analysis of Proportions Using Arcsine Transform (ANOPA), is completely analogous to the analysis of variance for means of continuous data, allowing the examination of interactions, main and simple effects, post-hoc tests, orthogonal contrasts, et cetera.
Results: We illustrate the method with a few examples (single-factor design, two-factor design, within-subject design, and mixed design) and explore type-I error rates with Monte Carlo simulations. We also examine power computation and confidence intervals for proportions.
Discussion: ANOPA is a complete series of analyses for proportions, applicable to any design.

Several shortcomings arise when logistic regression is applied to proportions.
First, the type-I error rate of logistic regression is more liberal than the nominal α, reaching rates as high as 8% in many scenarios (whereas it never exceeded 6.4% for the uncorrected ANOPA and never exceeded 5.4% for the corrected ANOPA). To assess these type-I error rates, we redid the first simulation described in Appendix A using a logistic regression analysis instead of an ANOPA. The only difference is that only 20,000 simulated data sets per point were tested; the logistic regression, being an iterative optimization problem, takes far more time to perform than the ANOPA. As seen in the Figure, sample sizes have to exceed 250 before the type-I error rate returns to the nominal α. When the true population proportion is low (0.1), samples have to be in the thousands before the type-I error rate returns to the nominal α. This undesirable behavior is not unexpected. The primary reason is that logistic regression uses the logit transformation rather than the Anscombe transformation. The logit transformation is not variance-stabilizing (as documented in Laurencelle, 2021b): the variance of the logit-transformed scores varies as the reciprocal of the proportions, which works against the homogeneity-of-variances assumption used in regression models. In addition, the test is based on Wilks's (1938) likelihood ratio test, which is only slowly asymptotic (the t and F tests are, in essence, refinements of Wilks's test for small samples).
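The variance-stabilization claim can be checked numerically. The following Python sketch (our illustration, not part of the original analyses) simulates binomial samples and compares the empirical variance of logit-transformed proportions, which depends strongly on the true proportion, with that of Anscombe-transformed counts, which stays near the theoretical 1/(4n + 2) regardless of the proportion:

```python
import math
import random

def logit(p):
    return math.log(p / (1 - p))

def anscombe(x, n):
    # Anscombe's (1948) arcsine transform; its asymptotic variance is
    # 1/(4n + 2), independent of the true proportion.
    return math.asin(math.sqrt((x + 3/8) / (n + 3/4)))

def var_of_transform(p, n, transform, nsim=5000, seed=1):
    """Monte Carlo estimate of the variance of a transformed binomial score."""
    rng = random.Random(seed)
    vals = []
    for _ in range(nsim):
        x = sum(rng.random() < p for _ in range(n))
        if transform is logit:
            if x == 0 or x == n:   # logit is undefined at the boundaries; skip
                continue
            vals.append(logit(x / n))
        else:
            vals.append(anscombe(x, n))
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / (len(vals) - 1)

n = 50
for p in (0.1, 0.3, 0.5):
    print(p, var_of_transform(p, n, logit), var_of_transform(p, n, anscombe))
```

The printout shows the logit variance growing as the proportion moves away from 0.5, while the Anscombe variance remains roughly constant near 1/(4 × 50 + 2) ≈ 0.005.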

ANOPA: Appendix B
We also believe that logistic regression lacks statistical power in some situations.
Consider as an example the second illustration of the main text. A logistic regression finds no significant coefficients. The coefficients estimated by the analysis are given in Table B1. Twice the log-likelihood index of fit is 429.22. If we constrain the model to have no interaction terms, twice the log-likelihood index of fit increases by 4.57, a nonsignificant increase according to a likelihood ratio test (Wilks, 1938). This indicates that the data contain no detectable differences when a logistic regression approach is used, whereas, by contrast, ANOPA detects a main effect (see the main text).
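To make the likelihood-ratio test concrete, the short Python sketch below (our illustration; the degrees of freedom are our assumption, based on dropping the two interaction coefficients of the 2 × 3 saturated model) converts the reported deviance increase of 4.57 into a p-value:

```python
import math

# Likelihood-ratio test (Wilks, 1938): the increase in deviance (twice the
# log-likelihood) when a model is constrained is compared to a chi-square
# distribution; here we assume df = 2, the two dropped interaction terms.
delta_deviance = 4.57

# For df = 2, the chi-square survival function has the closed form exp(-x/2).
p_value = math.exp(-delta_deviance / 2)
print(round(p_value, 3))   # about 0.102, above .05: nonsignificant
```

The resulting p-value of roughly .10 agrees with the text: the interaction terms can be removed without a significant loss of fit.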
Insert Table B1 about here
Another reason for this reduced statistical power is also easy to identify: the coefficients are not orthogonal because a non-linear transformation is used (using structural equation modeling, the covariances between the coefficients could be assessed).
These artificially induced inter-correlations between the coefficients may create spurious effects through multicollinearity, something left entirely unaddressed when logistic regressions are used to analyze proportions. In ANOPA, the decomposition of the total effect into additive components guarantees their orthogonality.
Second, the logistic regression does not test effects (main effects, interactions, etc.); it tests coefficients one by one. As seen in Table B1, there are 6 coefficients in a 2 × 3 design (a saturated model). The intercept corresponds to the proportion in the reference condition (here, Early diagnostic + Low SES); all the other observed scores are predicted from various combinations of the coefficients. In Table B1, the two interaction coefficients are close to significant. Yet they say nothing regarding the presence or absence of an interaction effect, which is the true sort of question an analyst has.
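The count of 6 coefficients follows from the treatment (dummy) coding of the saturated model: 1 intercept + 1 column for the two-level factor + 2 columns for the three-level factor + 2 interaction columns. The following sketch (our illustration; the factor names match the example but the coding choices are ours) builds that design matrix:

```python
from itertools import product

# Treatment coding of a 2 x 3 design: factor A (Moment of diagnostic, 2 levels)
# and factor B (SES, 3 levels); level 0 of each factor is the reference cell.
rows = []
for a, b in product(range(2), range(3)):
    a1 = 1 if a == 1 else 0                                # main-effect column for A
    b1, b2 = (1 if b == 1 else 0), (1 if b == 2 else 0)    # two columns for B
    rows.append([1, a1, b1, b2, a1 * b1, a1 * b2])         # + two interaction columns

for r in rows:
    print(r)
# 6 cells, 6 columns: the model is saturated, so each cell proportion is fitted
# exactly and every test bears on an individual coefficient, not on an effect.
```

The first row, [1, 0, 0, 0, 0, 0], is the reference cell whose (transformed) proportion is carried entirely by the intercept.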
Third, the logistic regression is sensitive to how the factors are coded. One level of each factor must be a level-zero condition; this cell serves as a baseline for all the other estimates. However, if you change which level is the level-zero cell, you get completely different estimates. Consider a dataset similar to the second illustration, but where two cell scores are lowered so as to produce significant effects: the resulting data are seen in Table B2.
Insert Table B2 about here
When SES-Low is coded as zero (and SES-Middle as 1), the logistic regression returns two interaction coefficients that are significantly different from zero. However, when SES-Middle is coded as zero (and SES-Low as 1), the main effect of Moment of diagnostic becomes significant (along with the second interaction). The two analyses are summarized in Table B3.
Insert Table B3 about here
Seeing a main-effect coefficient appear or disappear based on coding is unexpected to ANOVA users (but is explained by the fact that logistic regression does not test effects, as said above). The middle part of Table B3 shows the ANOPA results under either coding; as seen, they are totally identical, indicating a salient interaction effect.
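This coding sensitivity is easy to reproduce. Because a saturated logistic model fits each cell exactly, its coefficients are simply differences of cell logits, so they can be computed in closed form. The sketch below (our illustration, using a hypothetical 2 × 2 design with made-up proportions, not the values of Table B2) shows that switching the reference level of one factor changes the "main effect" coefficient of the other factor, even though both codings reproduce exactly the same cell proportions and hence the same likelihood:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

# Hypothetical cell proportions for a 2 x 2 design with an interaction
# (illustrative numbers only).
p = {(0, 0): 0.30, (0, 1): 0.35, (1, 0): 0.32, (1, 1): 0.60}

def saturated_coefs(ref_a, ref_b):
    """Closed-form coefficients of the saturated logistic model under
    treatment coding with reference levels (ref_a, ref_b)."""
    oa, ob = 1 - ref_a, 1 - ref_b          # the non-reference levels
    intercept = logit(p[ref_a, ref_b])
    main_a = logit(p[oa, ref_b]) - intercept
    main_b = logit(p[ref_a, ob]) - intercept
    inter = logit(p[oa, ob]) - intercept - main_a - main_b
    return intercept, main_a, main_b, inter

def fitted(coefs, a, b, ref_a, ref_b):
    """Cell proportion predicted by the model under a given coding."""
    eta = (coefs[0] + coefs[1] * (a != ref_a) + coefs[2] * (b != ref_b)
           + coefs[3] * (a != ref_a) * (b != ref_b))
    return 1 / (1 + math.exp(-eta))

c_ref0 = saturated_coefs(0, 0)
c_ref1 = saturated_coefs(0, 1)   # only the reference level of factor B changes
print("main effect of A, coding 1:", round(c_ref0[1], 3))
print("main effect of A, coding 2:", round(c_ref1[1], 3))
# Yet both codings reproduce the very same cell proportions (identical fit):
for (a, b), prop in p.items():
    assert abs(fitted(c_ref0, a, b, 0, 0) - prop) < 1e-12
    assert abs(fitted(c_ref1, a, b, 0, 1) - prop) < 1e-12
```

The two "main effect of A" coefficients differ substantially, while the model fit is identical: the coefficients, not the effects, depend on the coding.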
The data and analyses for this section are available on the OSF site, https://osf.io/gja9h/, in the folder LogisticRegression. By contrast, ANOPA tests effects (not coefficients) and is unaffected by coding; finally, its effects are orthogonal, as demonstrated by their additivity, so that spurious effects, multicollinearity, and the consequent loss of statistical power are impossible with ANOPA.
One additional demonstration of the additivity of ANOPA is that the interaction seen in the middle part of Table B3 can be decomposed into simple effects, as is commonly done in ANOVA. The results are seen in the bottom part of Table B3. The two simple effects' sums of squares (0.0444 and 0.1289) total 0.1333, which is indeed a valid decomposition of the interaction and Moment of Diagnostic effects (whose sums of squares, 0.0442 and 0.0890, total 0.1332; the difference is caused by rounding errors). We clearly see that SES has an effect only for the late-diagnostic participants. This result is not apparent to logistic regression users.
Some further advantages are that a) ANOPA can be used when the rate of success in one or more cells is 0% or 100%, whereas the logit transform is undefined in these situations; b) ANOPA leads to an easy-to-compute effect size that can be used in a priori power studies (we fail to see how this could be accomplished with logistic regression); and c) ANOPA leads to simple expressions for standard errors and confidence intervals, allowing the results to be plotted easily, something impossible to achieve with logistic regression, which is focused on coefficients, not on scores.
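Advantage a) can be verified directly: the Anscombe-corrected arcsine transform remains finite at 0 and n successes, where the logit would require log(0). A minimal sketch (our illustration):

```python
import math

def anscombe(x, n):
    # Arcsine transform with Anscombe's (1948) 3/8 correction: defined for
    # every x in 0..n, including 0 successes (x = 0) and all successes (x = n).
    return math.asin(math.sqrt((x + 3/8) / (n + 3/4)))

n = 20
print(anscombe(0, n))    # finite value just above 0
print(anscombe(n, n))    # finite value just below pi/2
# The logit, by contrast, diverges at the boundaries:
# logit(0/n) requires log(0), and logit(n/n) requires division by zero.
```

Both boundary values are well defined and lie strictly inside (0, π/2), so every cell of a design contributes to the analysis, even degenerate ones.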
With all the shortcomings of logistic regression reviewed above, and all the advantages of ANOPA, it is difficult to maintain that logistic regression is a superior technique to ANOPA.