Ideal Step Size Estimation for the Multinomial Logistic Regression

Gabriel Ramirez, Paul Rodriguez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep learning optimization is built on algorithms such as Stochastic Gradient Descent (SGD), which uses a subset of the data at each iteration to estimate the gradient of the cost function being minimized. Adaptive algorithms based on SGD are well known for making effective use of gradient information from past iterations: they generate momentum or memory that yields a more accurate prediction of the true gradient in future iterations, thus accelerating convergence. Nevertheless, these algorithms still require an initial (scalar) learning rate (LR) as well as an LR scheduler. In this work we propose a new SGD algorithm that estimates the initial (scalar) LR via an adaptation of the ideal Cauchy step size for multinomial logistic regression; furthermore, the LR is recursively updated up to a given number of epochs, after which a decaying LR scheduler is used. The proposed method is assessed on several well-known multiclass classification architectures and compares favorably against other well-tuned (scalar and spatially) adaptive alternatives, including the Adam algorithm.
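The abstract does not state the adapted step-size formula, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It uses the textbook Cauchy (exact steepest-descent) step size, alpha = (g^T g) / (g^T H g), evaluated for the multinomial logistic regression loss, re-estimates it for a fixed number of epochs, and then switches to an exponential decay. The full-batch updates, the function names (`cauchy_step`, `train`), and the parameters `warm_epochs` and `decay` are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grad(W, X, Y):
    # Mean cross-entropy of softmax(X @ W) against one-hot labels Y.
    n = X.shape[0]
    P = softmax(X @ W)
    loss = -np.sum(Y * np.log(P + 1e-12)) / n
    G = X.T @ (P - Y) / n
    return loss, G, P

def cauchy_step(G, X, P, eps=1e-12):
    # Textbook Cauchy step: alpha = <G, G> / (G^T H G).
    # For softmax cross-entropy, the curvature along direction G is the mean,
    # over samples, of the variance of the logit perturbation Z = X @ G
    # under the predicted class probabilities P.
    n = X.shape[0]
    Z = X @ G
    curv = (np.sum(P * Z**2) - np.sum(np.sum(P * Z, axis=1) ** 2)) / n
    curv = max(curv, 0.0)  # guard against tiny negative round-off
    return float(np.sum(G * G) / (curv + eps))

def train(X, y, num_classes, epochs=20, warm_epochs=5, decay=0.9):
    # Assumed schedule (hypothetical): re-estimate the LR with the Cauchy
    # formula for the first `warm_epochs` epochs (must be >= 1), then decay
    # the last estimate geometrically. Full-batch, unlike minibatch SGD.
    n, d = X.shape
    Y = np.eye(num_classes)[y]  # one-hot labels
    W = np.zeros((d, num_classes))
    lr, losses, lrs = None, [], []
    for epoch in range(epochs):
        loss, G, P = loss_and_grad(W, X, Y)
        if epoch < warm_epochs:
            lr = cauchy_step(G, X, P)
        else:
            lr *= decay
        W -= lr * G
        losses.append(loss)  # loss measured before this epoch's update
        lrs.append(lr)
    return W, losses, lrs

# Usage on synthetic, overlapping three-class data (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0]])
y = rng.integers(0, 3, size=150)
X = centers[y] + rng.normal(size=(150, 2))
X = np.hstack([X, np.ones((150, 1))])  # append a bias column
W, losses, lrs = train(X, y, num_classes=3)
```

The step size here needs no hand-tuned initial value because it is derived from the local curvature along the gradient direction; a minibatch variant would compute the same quantities on each batch, at the cost of a noisier estimate.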

Original language: English
Title of host publication: LASCAS 2024 - 15th IEEE Latin American Symposium on Circuits and Systems, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350381221
State: Published - 2024
Event: 15th IEEE Latin American Symposium on Circuits and Systems, LASCAS 2024 - Punta del Este, Uruguay
Duration: 27 Feb 2024 to 1 Mar 2024

Publication series

Name: LASCAS 2024 - 15th IEEE Latin American Symposium on Circuits and Systems, Proceedings

Conference

Conference: 15th IEEE Latin American Symposium on Circuits and Systems, LASCAS 2024
Country/Territory: Uruguay
City: Punta del Este
Period: 27/02/24 to 1/03/24

Keywords

  • adaptive step size
  • Deep learning
  • multinomial logistic regression
  • stochastic gradient descent
