Ideal Step Size Estimation for the Multinomial Logistic Regression

Gabriel Ramirez, Paul Rodriguez

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

Abstract

At the core of deep learning optimization are algorithms such as Stochastic Gradient Descent (SGD), which uses a subset of the data per iteration to estimate the gradient of the cost function being minimized. Adaptive algorithms based on SGD are known to make effective use of gradient information from past iterations, building momentum or memory that yields a more accurate prediction of the true gradient slope in future iterations and thus accelerates convergence. Nevertheless, these algorithms still require an initial (scalar) learning rate (LR) as well as an LR scheduler. In this work we propose a new SGD algorithm that estimates the initial (scalar) LR via an adaptation of the ideal Cauchy step size for the multinomial logistic regression; the LR is then recursively updated up to a given number of epochs, after which a decaying LR scheduler is used. The proposed method is assessed on several well-known multiclass classification architectures and compares favorably against other well-tuned (scalar and spatially) adaptive alternatives, including the Adam algorithm.
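
As a rough illustration of the idea (not the authors' algorithm, whose details are not given in this abstract), the classical Cauchy step size for a gradient g and Hessian H is alpha = (g^T g) / (g^T H g), the exact minimizer of a quadratic model of the loss along -g. The sketch below, in plain Python, evaluates that ratio for a softmax (multinomial logistic) regression loss, using the per-sample logit Hessian diag(p) - p p^T. The function names, the toy data, and the full-batch evaluation are assumptions made purely for illustration.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # z = W x for a weight matrix stored as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def loss_and_grad(W, X, y):
    # Mean cross-entropy and its gradient with respect to W.
    C, D, N = len(W), len(W[0]), len(X)
    loss = 0.0
    G = [[0.0] * D for _ in range(C)]
    for x, label in zip(X, y):
        p = softmax(logits(W, x))
        loss -= math.log(p[label]) / N
        for c in range(C):
            r = (p[c] - (1.0 if c == label else 0.0)) / N
            for d in range(D):
                G[c][d] += r * x[d]
    return loss, G

def cauchy_step(W, X, G):
    # alpha = <G, G> / <G, H G>, where the curvature along G is the
    # mean over samples of u^T (diag(p) - p p^T) u with u = G x.
    gg = sum(g * g for row in G for g in row)
    gHg = 0.0
    for x in X:
        p = softmax(logits(W, x))
        u = logits(G, x)
        mean_u = sum(pc * uc for pc, uc in zip(p, u))
        gHg += sum(pc * uc * uc for pc, uc in zip(p, u)) - mean_u ** 2
    gHg /= len(X)
    # Fall back to 0.0 when the curvature along G vanishes (assumed policy).
    return gg / gHg if gHg > 0.0 else 0.0

# Toy problem (hypothetical data): two classes, one feature, zero initial weights.
X = [[1.0], [-1.0]]
y = [0, 1]
W = [[0.0], [0.0]]

loss0, G = loss_and_grad(W, X, y)
alpha = cauchy_step(W, X, G)
W = [[w - alpha * g for w, g in zip(rw, rg)] for rw, rg in zip(W, G)]
loss1, _ = loss_and_grad(W, X, y)
print(alpha, loss0, loss1)
```

On this toy problem the estimated step is 2.0 and a single update lowers the loss from log 2. Because the loss is not quadratic, such a step can overshoot when the iterate is far from the minimizer, which is one plausible reason to update the estimate only during early epochs and then hand over to a decaying scheduler.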

Original language: English
Title of host publication: LASCAS 2024 - 15th IEEE Latin American Symposium on Circuits and Systems, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798350381221
DOI
Status: Published - 2024
Event: 15th IEEE Latin American Symposium on Circuits and Systems, LASCAS 2024 - Punta del Este, Uruguay
Duration: 27 Feb 2024 to 1 Mar 2024

Publication series

Name: LASCAS 2024 - 15th IEEE Latin American Symposium on Circuits and Systems, Proceedings

Conference

Conference: 15th IEEE Latin American Symposium on Circuits and Systems, LASCAS 2024
Country/Territory: Uruguay
City: Punta del Este
Period: 27/02/24 to 1/03/24
