TY - JOUR

T1 - A robust regression model for bounded count health data

AU - Bayes, Cristian L.

AU - Bazán, Jorge Luis

AU - Valdivieso, Luis

N1 - Publisher Copyright: © The Author(s) 2024.

PY - 2024

Y1 - 2024

N2 - Bounded count response data arise naturally in health applications. In general, the well-known beta-binomial regression model forms the basis for analyzing these data, especially when the data are overdispersed. Little attention, however, has been given in the literature to the possibility of having both extreme observations and overdispersed data. We propose in this work an extension of the beta-binomial regression model, named the beta-2-binomial regression model, which provides a flexible approach for fitting a regression model to a wide spectrum of bounded count response data sets in the presence of overdispersion, outliers, or an excess of extreme observations. This distribution possesses more skewness and kurtosis than the beta-binomial model but preserves the same mean and variance form as the beta-binomial model. Additional properties of the beta-2-binomial distribution are derived, including its behavior at the limits of its parameter space. A penalized maximum likelihood approach is considered to estimate the parameters of this model, and a residual analysis is included to assess departures from model assumptions as well as to detect outlying observations. Simulation studies considering robustness to outliers are presented, confirming that the beta-2-binomial regression model is a more robust alternative than the binomial and beta-binomial regression models. We also found that the beta-2-binomial regression model outperformed the binomial and beta-binomial regression models in our applications of predicting liver cancer development in mice and the number of inappropriate days a patient spent in a hospital.

AB - Bounded count response data arise naturally in health applications. In general, the well-known beta-binomial regression model forms the basis for analyzing these data, especially when the data are overdispersed. Little attention, however, has been given in the literature to the possibility of having both extreme observations and overdispersed data. We propose in this work an extension of the beta-binomial regression model, named the beta-2-binomial regression model, which provides a flexible approach for fitting a regression model to a wide spectrum of bounded count response data sets in the presence of overdispersion, outliers, or an excess of extreme observations. This distribution possesses more skewness and kurtosis than the beta-binomial model but preserves the same mean and variance form as the beta-binomial model. Additional properties of the beta-2-binomial distribution are derived, including its behavior at the limits of its parameter space. A penalized maximum likelihood approach is considered to estimate the parameters of this model, and a residual analysis is included to assess departures from model assumptions as well as to detect outlying observations. Simulation studies considering robustness to outliers are presented, confirming that the beta-2-binomial regression model is a more robust alternative than the binomial and beta-binomial regression models. We also found that the beta-2-binomial regression model outperformed the binomial and beta-binomial regression models in our applications of predicting liver cancer development in mice and the number of inappropriate days a patient spent in a hospital.

KW - Count data

KW - beta-2-binomial

KW - beta-binomial

KW - generalized additive model for location, scale and shape

KW - penalized maximum-likelihood estimation

KW - regression models

UR - http://www.scopus.com/inward/record.url?scp=85195579320&partnerID=8YFLogxK

U2 - 10.1177/09622802241259178

DO - 10.1177/09622802241259178

M3 - Article

C2 - 38847408

AN - SCOPUS:85195579320

SN - 0962-2802

JO - Statistical Methods in Medical Research

JF - Statistical Methods in Medical Research

ER -