Assessment of a Two-step Integration Method as an Optimizer for Deep Learning

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

Abstract

It is well known that accelerated (non-stochastic) optimization methods can be understood as multi-step integration methods: e.g. Polyak's heavy-ball and Nesterov's accelerations can be derived as particular instances of a two-step integration method applied to the gradient flow. However, in the stochastic context, to the best of our knowledge, multi-step integration methods have not been exploited as such, but only through particular instances, i.e. SGD (stochastic gradient descent) with momentum or with Nesterov acceleration. In this paper we propose to directly use a two-step (TS) integration method in the stochastic context. Furthermore, we assess the computational effectiveness of selecting the TS method's weights after considering its lattice representation. Our experiments include several well-known multiclass classification architectures (AlexNet, VGG16 and EfficientNetV2) as well as several established stochastic optimizers, e.g. SGD with momentum/Nesterov acceleration and ADAM. The TS-based method attains a better test accuracy than the first two, whereas it is competitive with a well-tuned (ε/learning rate) ADAM.
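To make the link between two-step recursions and the familiar accelerations concrete, the following is a minimal sketch of a generic two-step update, x_{k+1} = x_k + beta (x_k - x_{k-1}) - alpha grad f(y_k) with y_k = x_k + gamma (x_k - x_{k-1}). It is not the TS method or the weight selection evaluated in the paper, which the abstract does not specify. The function name, the parameters `alpha`, `beta`, `gamma`, and the toy quadratic objective are all assumptions chosen for illustration.

```python
# Sketch only: a generic two-step recursion, NOT the paper's TS method.
# All names (two_step_minimize, alpha, beta, gamma) are illustrative assumptions.

def two_step_minimize(grad, x0, alpha, beta, gamma, steps):
    """Iterate x_{k+1} = x_k + beta*(x_k - x_{k-1}) - alpha*grad(y_k),
    where y_k = x_k + gamma*(x_k - x_{k-1}).

    gamma = 0    -> Polyak heavy-ball update
    gamma = beta -> Nesterov-style update
    beta = 0     -> plain gradient descent (a one-step method)
    """
    x_prev = list(x0)
    x = list(x0)  # zero initial "velocity": x_{-1} = x_0
    for _ in range(steps):
        diff = [a - b for a, b in zip(x, x_prev)]
        y = [a + gamma * d for a, d in zip(x, diff)]  # gradient evaluation point
        g = grad(y)
        x_next = [a + beta * d - alpha * gi for a, d, gi in zip(x, diff, g)]
        x_prev, x = x, x_next
    return x


# Toy objective (an assumption): f(x) = 0.5 * sum(c_i * x_i**2)
CURV = [1.0, 10.0]

def quad_grad(x):
    return [c * xi for c, xi in zip(CURV, x)]

def quad_loss(x):
    return 0.5 * sum(c * xi * xi for c, xi in zip(CURV, x))

x0 = [1.0, 1.0]
x_gd = two_step_minimize(quad_grad, x0, alpha=0.05, beta=0.0, gamma=0.0, steps=200)
x_hb = two_step_minimize(quad_grad, x0, alpha=0.05, beta=0.5, gamma=0.0, steps=200)
x_nag = two_step_minimize(quad_grad, x0, alpha=0.05, beta=0.5, gamma=0.5, steps=200)
```

In a stochastic setting, `grad` would return a mini-batch gradient estimate instead of the exact gradient; the recursion itself is unchanged. Treating `beta` and `gamma` as free weights rather than tying them together is what distinguishes a general two-step scheme from its momentum and Nesterov special cases.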

Original language: English
Title of host publication: 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 1245-1249
Number of pages: 5
ISBN (electronic): 9789464593600
DOI
Publication status: Published - 2023
Event: 31st European Signal Processing Conference, EUSIPCO 2023 - Helsinki, Finland
Duration: 4 Sep 2023 to 8 Sep 2023

Publication series

Name: European Signal Processing Conference
ISSN (print): 2219-5491

Conference

Conference: 31st European Signal Processing Conference, EUSIPCO 2023
Country/Territory: Finland
City: Helsinki
Period: 4/09/23 to 8/09/23
