This site uses cookies to deliver our services and to ensure you get the best experience. By continuing to use this site, you consent to our use of cookies and acknowledge that you have read and understand our Privacy Policy, Cookie Policy, and Terms
We show that the optimal complexity of Nesterov's smooth first-order optimization algorithm is preserved when the gradient is only computed up to a small, uniformly bounded er...
Neural networks use neurons of the same type in each layer but such architecture cannot lead to data models of optimal complexity and accuracy. Networks with architectures (number ...