Mixing ADAM and SGD: a Combined Optimization Method
Optimization methods (optimizers) receive special attention for the efficient training of neural networks in the field of deep learning. In the literature there are many papers that compare neural models trained with different optimizers.
Each paper demonstrates that, for a particular problem, one optimizer is better than the others, but as the problem changes this type of result no longer holds and we have to start from scratch. In our paper we propose combining two very different optimizers that, when used simultaneously, can overcome the performance of the