Assessment of a Two-step Integration Method as an Optimizer for Deep Learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

It is well known that accelerated (non-stochastic) optimization methods can be understood as multi-step integration methods: for example, Polyak's heavy ball and Nesterov's acceleration can be derived as particular instances of a two-step integration method applied to the gradient flow. However, to the best of our knowledge, multi-step integration methods have not been exploited as such in the stochastic context, only through particular instances, i.e. SGD (stochastic gradient descent) with momentum or with Nesterov acceleration. In this paper we propose to directly use a two-step (TS) integration method in the stochastic context. Furthermore, we assess the computational effectiveness of selecting the TS method's weights after considering its lattice representation. Our experiments include several well-known multiclass classification architectures (AlexNet, VGG16 and EfficientNetV2) as well as several established stochastic optimizers, e.g. SGD with momentum or Nesterov acceleration, and ADAM. The TS-based method attains better test accuracy than the first two, whereas it is competitive with a well-tuned (ε/learning rate) ADAM.
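
The relation between momentum methods and two-step recurrences mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's TS method: it uses the commonly cited unified recurrence x_{k+1} = x_k + beta*(x_k - x_{k-1}) - alpha*grad(x_k + gamma*(x_k - x_{k-1})), in which gamma = 0 gives heavy ball and gamma = beta gives Nesterov's acceleration. The names `two_step`, `heavy_ball_momentum`, `quad_grad`, the parameters `alpha`, `beta`, `gamma`, and the test quadratic are all assumptions made for illustration, and the gradient here is deterministic rather than stochastic.

```python
from typing import Callable, List

Vector = List[float]


def two_step(grad: Callable[[Vector], Vector], x0: Vector,
             alpha: float, beta: float, gamma: float, steps: int) -> Vector:
    """Illustrative two-step recurrence (hypothetical helper, not the paper's method):
    x_{k+1} = x_k + beta*(x_k - x_{k-1}) - alpha*grad(x_k + gamma*(x_k - x_{k-1})).
    gamma = 0 -> heavy ball; gamma = beta -> Nesterov; beta = 0 -> gradient descent.
    """
    x_prev = list(x0)  # assume x_{-1} = x_0, i.e. zero initial velocity
    x = list(x0)
    for _ in range(steps):
        diff = [a - b for a, b in zip(x, x_prev)]
        g = grad([a + gamma * d for a, d in zip(x, diff)])
        x_next = [a + beta * d - alpha * gi for a, d, gi in zip(x, diff, g)]
        x_prev, x = x, x_next
    return x


def heavy_ball_momentum(grad: Callable[[Vector], Vector], x0: Vector,
                        alpha: float, beta: float, steps: int) -> Vector:
    """Classical velocity form: v <- beta*v - alpha*grad(x); x <- x + v."""
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(steps):
        g = grad(x)
        v = [beta * vi - alpha * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


def quad_grad(x: Vector) -> Vector:
    # Gradient of the assumed test function f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2).
    return [x[0], 10.0 * x[1]]


# With gamma = 0 the two-step form and the velocity form follow the same trajectory.
x_ts = two_step(quad_grad, [1.0, 1.0], alpha=0.05, beta=0.9, gamma=0.0, steps=200)
x_hb = heavy_ball_momentum(quad_grad, [1.0, 1.0], alpha=0.05, beta=0.9, steps=200)
```

Writing the update in terms of x_k and x_{k-1} only, rather than carrying a separate velocity variable, is what makes the two-step structure explicit; the two weights then play the role of the free coefficients one would tune.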

Original language: English
Title of host publication: 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 1245-1249
Number of pages: 5
ISBN (Electronic): 9789464593600
State: Published - 2023
Event: 31st European Signal Processing Conference, EUSIPCO 2023 - Helsinki, Finland
Duration: 4 Sep 2023 to 8 Sep 2023

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491

Conference

Conference: 31st European Signal Processing Conference, EUSIPCO 2023
Country/Territory: Finland
City: Helsinki
Period: 4/09/23 to 8/09/23

Keywords

  • gradient flow
  • stochastic gradient descent
