Sunday, 29 January 2023

Show HN: Train CIFAR10 to 94% in under 10 seconds on a single A100, world record https://ift.tt/yItpRvU

Hi,

My career is currently in this field, and I created this project as (effectively, among other things) a living resume, and also as a really great workbench for hacking and experimenting on different methods. Testing and getting a feel for how different methods work within this framework is truly a delight, and quite simple and fast. Additionally, many of the mathematical concepts should transfer, so this has (for me) been a really great proving ground for testing how something might work in a different place in the real world.

We hope to get under 2 seconds of training time (for 94%) within about two years, so stay tuned for updates as we continue to push changes that take us faster and faster than our starting point of ~18.1 seconds.

By the way, this architecture and these training hyperparameters do indeed scale well: just increase epochs from 10->80 and base_depth from 64->128 and you'll have about 95.77% accuracy in about 188 seconds (just over 3 minutes :D). That alone is a huge boon! Great to see scaling laws working well within this very, very tight hyperparameter resolution.

Feel free to let me know if you have any questions; Hacker News always seems to get me the most traffic. I really love talking about this project, and can't really seem to find anyone to nerd out about it with. This is very, very cool stuff! So feel free to leave a comment, and I'd love to jump in and chat about it! :D :) <3

https://github.com/tysam-code/hlb-CIFAR10

January 30, 2023 at 07:28AM
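The scaling change mentioned in the post (epochs 10->80, base_depth 64->128) can be pictured with a minimal sketch. The dictionary layout and the `scaled_config` helper below are hypothetical and only for illustration; the repository may name or nest these settings differently, so check its source before copying anything.

```python
from copy import deepcopy

# Hypothetical config layout (an assumption, not the repository's actual structure).
# The values are the baseline settings quoted in the post.
BASELINE = {"epochs": 10, "base_depth": 64}


def scaled_config(baseline, epochs=80, base_depth=128):
    """Return a copy of `baseline` with the larger settings quoted in the post."""
    config = deepcopy(baseline)  # leave the baseline untouched
    config["epochs"] = epochs
    config["base_depth"] = base_depth
    return config


if __name__ == "__main__":
    larger = scaled_config(BASELINE)
    print("baseline:", BASELINE)
    print("scaled:  ", larger)
```

Keeping the baseline immutable and deriving the larger run from it makes it easy to compare the two settings side by side. The accuracy and timing figures are the author's reported numbers, not something this sketch reproduces.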


Show HN: List of Clojure-Like projects https://ift.tt/1PqyRdW

https://ift.tt/0tLoXIR August 2, 2025 at 01:47AM