TY - JOUR
T1 - A robust regression model for bounded count health data
AU - Bayes, Cristian L.
AU - Bazán, Jorge Luis
AU - Valdivieso, Luis
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/8
Y1 - 2024/8
N2 - Bounded count response data arise naturally in health applications. In general, the well-known beta-binomial regression model forms the basis for analyzing such data, especially when the data are overdispersed. Little attention, however, has been given in the literature to the possibility of having both extreme observations and overdispersed data. We propose in this work an extension of the beta-binomial regression model, named the beta-2-binomial regression model, which provides a rather flexible approach for fitting a regression model to a wide spectrum of bounded count response data sets in the presence of overdispersion, outliers, or an excess of extreme observations. This distribution possesses more skewness and kurtosis than the beta-binomial model but preserves the same mean and variance form of the beta-binomial model. Additional properties of the beta-2-binomial distribution are derived, including its behavior at the limits of its parameter space. A penalized maximum likelihood approach is considered to estimate the parameters of this model, and a residual analysis is included to assess departures from model assumptions as well as to detect outlying observations. Simulation studies considering robustness to outliers are presented, confirming that the beta-2-binomial regression model is a more robust alternative than the binomial and beta-binomial regression models. We also found that the beta-2-binomial regression model outperformed the binomial and beta-binomial regression models in our applications of predicting liver cancer development in mice and the number of inappropriate days a patient spent in a hospital.
AB - Bounded count response data arise naturally in health applications. In general, the well-known beta-binomial regression model forms the basis for analyzing such data, especially when the data are overdispersed. Little attention, however, has been given in the literature to the possibility of having both extreme observations and overdispersed data. We propose in this work an extension of the beta-binomial regression model, named the beta-2-binomial regression model, which provides a rather flexible approach for fitting a regression model to a wide spectrum of bounded count response data sets in the presence of overdispersion, outliers, or an excess of extreme observations. This distribution possesses more skewness and kurtosis than the beta-binomial model but preserves the same mean and variance form of the beta-binomial model. Additional properties of the beta-2-binomial distribution are derived, including its behavior at the limits of its parameter space. A penalized maximum likelihood approach is considered to estimate the parameters of this model, and a residual analysis is included to assess departures from model assumptions as well as to detect outlying observations. Simulation studies considering robustness to outliers are presented, confirming that the beta-2-binomial regression model is a more robust alternative than the binomial and beta-binomial regression models. We also found that the beta-2-binomial regression model outperformed the binomial and beta-binomial regression models in our applications of predicting liver cancer development in mice and the number of inappropriate days a patient spent in a hospital.
KW - Count data
KW - beta-2-binomial
KW - beta-binomial
KW - generalized additive model for location, scale and shape
KW - penalized maximum-likelihood estimation
KW - regression models
UR - http://www.scopus.com/inward/record.url?scp=85195579320&partnerID=8YFLogxK
U2 - 10.1177/09622802241259178
DO - 10.1177/09622802241259178
M3 - Article
C2 - 38847408
AN - SCOPUS:85195579320
SN - 0962-2802
VL - 33
SP - 1392
EP - 1411
JO - Statistical Methods in Medical Research
JF - Statistical Methods in Medical Research
IS - 8
ER -