Tip trick
Medium
@karpathy
Importance score: 5 • Posted: March 07, 2026 at 22:15
runs great but probably requires some tuning! i'm guessing:
- WINDOW_PATTERN = "L" is a lot faster (mixed window sizes are only natively supported by FA3)
- then probably: DEPTH a lot lower, e.g. even 4?
- DEVICE_BATCH_SIZE can probably go up more
- then TOTAL_BATCH_SIZE probably a lot lower, e.g. 2**16?
needs a bit of tuning to get to a better initial spot (or you can try to let the agent figure it out, but it's not certain it would. could be fun to try!).
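To make the guesses in the post concrete, here is a minimal sketch of how those four settings might look as a starting point, along with the gradient accumulation count they imply. Only the names WINDOW_PATTERN, DEPTH, DEVICE_BATCH_SIZE and TOTAL_BATCH_SIZE and the values "L", 4 and 2**16 come from the post. MAX_SEQ_LEN, the DEVICE_BATCH_SIZE value, the helper grad_accum_steps, and the reading that TOTAL_BATCH_SIZE counts tokens while DEVICE_BATCH_SIZE counts sequences are all assumptions for illustration and should be checked against the actual training script.

```python
# Starting values suggested (as guesses) in the post.
WINDOW_PATTERN = "L"      # single window type; the post says mixed sizes need FA3
DEPTH = 4                 # "a lot lower, e.g. even 4?"
TOTAL_BATCH_SIZE = 2**16  # "a lot lower, e.g. 2**16?"; assumed to be in tokens

# Assumed values, NOT from the post.
DEVICE_BATCH_SIZE = 16    # hypothetical; the post only says it "can probably go up more"
MAX_SEQ_LEN = 2048        # hypothetical sequence length


def grad_accum_steps(total_batch_tokens, device_batch_size, seq_len, world_size=1):
    """Return how many micro-batches make up one optimizer step.

    Assumes total_batch_tokens is measured in tokens and device_batch_size
    in sequences of seq_len tokens each (an assumption, see lead-in).
    """
    tokens_per_micro_batch = device_batch_size * seq_len * world_size
    if total_batch_tokens % tokens_per_micro_batch != 0:
        raise ValueError(
            "total batch size must be a multiple of "
            "device_batch_size * seq_len * world_size"
        )
    return total_batch_tokens // tokens_per_micro_batch


ACCUM_STEPS = grad_accum_steps(TOTAL_BATCH_SIZE, DEVICE_BATCH_SIZE, MAX_SEQ_LEN)
print(f"gradient accumulation steps: {ACCUM_STEPS}")
```

Under these assumed numbers, raising DEVICE_BATCH_SIZE while lowering TOTAL_BATCH_SIZE shrinks the accumulation count, which is one plausible reason the two suggestions appear together in the post.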
Grok reasoning
Karpathy provides specific hyperparameter tuning tips for running autoresearch on macOS, advancing practical LLM training.
Likes
137
Reposts
3
Views
13,883
Tweet ID: 2030406981857759606
Prompt source: ai-influencers-news
Fetched at: March 08, 2026 at 06:03