Challenge
Implement a performant optimizer that accelerates training while meeting rigorous test-loss thresholds.
The recent surge in Artificial Intelligence (AI) has been largely driven by deep learning—an approach made possible by vast datasets, highly parallel computing power, and neural network frameworks that support automatic differentiation. Neural networks serve as the fundamental building blocks for these complex AI systems.
The training process of a neural network is often incredibly resource-intensive, requiring massive amounts of data and computational power, which translates to significant financial cost. At the heart of this training process lie optimization algorithms based on gradient descent. These algorithms systematically adjust the network's parameters to minimize a "loss function," effectively teaching the model to perform its designated task accurately.
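In its plainest form, a gradient-descent step moves every parameter a small distance against the gradient of the loss. Writing the parameters as $\theta$, the loss as $L$, and the learning rate as $\eta$ (symbols introduced here purely for illustration), one update is:

$$\theta_{t+1} = \theta_t - \eta \,\nabla_{\theta} L(\theta_t)$$

Practical optimizers such as momentum SGD and Adam build on this same step, keeping running statistics of past gradients to improve stability and convergence speed.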
Even small gains in optimization efficiency translate into shorter training times, lower energy usage, and significant cost savings, underscoring the value of continued research into better optimizers.
TIG’s neural network optimizer challenge asks innovators to implement an optimizer that plugs into a fixed CUDA-based training framework and trains a multi-layer perceptron (MLP) on a synthetic regression task. The goal is to obtain a low test loss.
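To make the shape of such a submission concrete, the sketch below shows the kind of per-parameter update kernel an optimizer might run on each training step. It is a generic SGD-with-momentum update, not the challenge's actual interface; the function name, signature, and launch configuration are assumptions for illustration only.

```cuda
// Hypothetical per-parameter update kernel: SGD with momentum.
// This is NOT the TIG challenge interface; names and signatures are
// illustrative assumptions only.
#include <cuda_runtime.h>

__global__ void sgd_momentum_update(float* params,       // network weights, updated in place
                                    const float* grads,  // gradients of the loss w.r.t. params
                                    float* velocity,     // per-parameter momentum buffer
                                    float lr,            // learning rate
                                    float momentum,      // momentum coefficient
                                    int n)                // total number of parameters
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // v <- momentum * v + g;  theta <- theta - lr * v
        velocity[i] = momentum * velocity[i] + grads[i];
        params[i]  -= lr * velocity[i];
    }
}

// Example launch over a flattened parameter tensor of length n:
//   int threads = 256;
//   int blocks  = (n + threads - 1) / threads;
//   sgd_momentum_update<<<blocks, threads>>>(d_params, d_grads, d_velocity, 1e-3f, 0.9f, n);
```

An actual submission would swap in whatever update rule reaches the target test loss fastest (adaptive learning rates, second-moment estimates, learning-rate schedules, and so on), while keeping the per-parameter, embarrassingly parallel structure that maps naturally onto CUDA threads.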
Neural networks now underpin everything from chatbots to self-driving cars, and their training efficiency dictates cost, speed, and energy use. Since nearly all of that training hinges on gradient-descent methods, even small optimizer improvements ripple across AI and into other fields.
Below are some of the highest-impact domains where faster, more reliable training already yields real-world gains: