Reading — Jeet Ganatra

currently reading (2)

Scaling Laws for Neural Language Models — Kaplan et al. Re-reading carefully alongside the Chinchilla paper.
Software Engineering at Google — Titus Winters et al. Slow read. Skipping the chapters that aren't relevant; lingering on the ones that are.

finished (2)

The Unreasonable Effectiveness of Data — Halevy, Norvig, Pereira. Old but somehow still right. Worth a re-read every couple of years.
Training Compute-Optimal Large Language Models (Chinchilla) — Hoffmann et al. Paired well with Kaplan. Worth memorizing the loss-vs-compute curves.

queued (1)

Deep Learning with PyTorch — Eli Stevens, Luca Antiga, Thomas Viehmann. Lined up after Chinchilla.