Reading
books and papers, current and past
currently reading (2)
- Scaling Laws for Neural Language Models. Re-reading carefully alongside the Chinchilla paper.
- Software Engineering at Google. A slow read; skipping the chapters that aren't relevant, lingering on the ones that are.
finished (2)
- The Unreasonable Effectiveness of Data. Old but somehow still right. Worth a re-read every couple of years.
- Training Compute-Optimal Large Language Models (Chinchilla). Paired well with Kaplan; worth memorizing the loss-vs-compute curves.
queued (1)
- Deep Learning with PyTorch. Lined up after Chinchilla.