Abstract:
Optimizers are central to everyday Machine Learning and Deep Learning applications. The core task in training any machine learning model is to solve $\min_x f(x)$, where $f$ is the objective function and $x$ is the input parameter. For simple convex objectives, standard algorithms such as gradient descent suffice. Nowadays, more complex state-of-the-art optimizers like ADAM, ADAMSSD, and DADAM are used, and recent advances have tackled minimization in possibly non-convex settings, as well as in online and distributed settings. Our aim is to develop an optimizer for distributed and online settings using control-theoretic analysis. Existing control-theoretic work on optimizers has not explored distributed and online settings.
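As a point of reference for the baseline mentioned above, the following is a minimal sketch of vanilla gradient descent on a convex objective; it is illustrative only (the function name `gradient_descent`, the step size, and the example objective are our own choices, not drawn from the paper):

```python
import numpy as np

def gradient_descent(grad_f, x0, lr=0.1, n_steps=100):
    """Vanilla gradient descent: x_{t+1} = x_t - lr * grad_f(x_t)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - lr * grad_f(x)
    return x

# Example: minimize the convex quadratic f(x) = ||x - 3||^2,
# whose gradient is 2 * (x - 3); the unique minimizer is x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=[0.0])
print(x_min)  # approximately [3.0]
```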