A lot has changed for me in the past month. My partner and I decided to close the business we had started together, and I've thrown myself full-force at AI safety.
We weren't seeing the traction we needed, I was nearing the edge of burnout (web development is not the thing for me1), and, at the end of the day, I did not care enough about our users. It's hard to stay motivated to help a few patients today when you think there's a considerable risk that the world might end tomorrow. And I think the world might end soon — not tomorrow, but more likely than not in the next few decades.2 At some point, I could no longer look away, and I had to do something.
So I reached out to the 80,000 Hours team, who connected me to people studying AI safety in my area and helped me apply to the FTX Future Fund Regranting Program for a six-month, $25,000 upskilling grant to kickstart my transition to AI.
Now, I'm not a novice (my Bachelor's and Master's theses applied techniques from statistical physics to understand neural networks), but I could definitely use the time to refresh & catch up on the latest techniques. A year is a long time in AI.
Besides "upskilling" in ML proper, I need the time to dive deep into AI safety: there's overlap with the conventional ML literature, but there's also a lot of unfamiliar material.
Finally, I need time to brush up my CV and prepare to apply to AI labs and research groups. My current guess is that I'll be best suited to empirical/interpretability research, which I think is likely to be compute-constrained, so working at a larger lab is crucial. That's not to mention the benefits of working alongside people smarter than you are. Unfortunately (for me), the field is competitive, and a "gap year" in an unrelated field after your master's is likely to be perceived as a weakness. There's a signaling game at hand, and it's play or be played. In sum, spending time on intangibles like "networking" and tangibles like "publications"3 will be a must.
To keep myself focused throughout the next half year, I'll be keeping track of my goals and progress here. To start, here's my current plan.
Like all good plans, this plan consists of three parts:
- Mathematics/Theory of ML
- Implementation/Practice of ML
- AI Safety
There's also an overarching theme of "community-building" (i.e., attending EAGs and other events in the space) and of "publishing".
- Mathematics for Machine Learning by Deisenroth, Faisal, and Ong (2020).
- I was told that this book is predominantly important for its first half, but I'm ready to consume it in full.
- Pattern Recognition and Machine Learning by Bishop (2006)
- I was advised to focus on chapters 1-5 and 9, but I'm aiming to at least skim the entirety.
- Cracking the Coding Interview by McDowell (2015)
- One specification I'm going to have to game is the interview. I'm also taking this as an opportunity to master Rust, as I think having a solid understanding of low-level systems programming is going to be an important enabler when working with large models.
- Practical Deep Learning for Coders by Fast AI
- This comes with an accompanying book.
- Spinning Up by OpenAI
There are a bunch more, but these are the only ones I'm currently committing to finishing. The rest can serve as supplementary material after.
AI Safety Resources
- Podcasts: AXRP, The Inside View, 80,000 Hours
- Articles: AI Safety Papers, <Insert important ML papers here>, <Insert Alignment Forum best of here>
- AI Alignment Newsletter, Robert Miles' YouTube channel4
I'm not particularly concerned about publishing in prestigious journals, but getting content out there will definitely help. Most immediately, I'm aiming to adapt my Master's thesis for an AI safety/interpretability audience. I'm intrigued by the possibility that perspectives like the Lyapunov spectrum can help us enforce constraints like "forgetfulness" (which may be a stronger condition than myopia), analyze the path-dependence of training, and detect sensitivity to adversarial attacks and improbable inputs; that random matrix theory might offer novel ways to analyze the dynamics of training; and, more generally, that statistical physics is an un(der)tapped source of interpretability insight.
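To give a flavor of the kind of dynamical-systems quantity I have in mind (this is a textbook toy, not anything from my thesis), here's a minimal sketch estimating the largest Lyapunov exponent of the logistic map by averaging log|f'(x)| along an orbit. The same averaging idea extends, in principle, to Jacobians of a network's training dynamics:

```python
import math

def lyapunov_logistic(r, x0=0.3, n_transient=1_000, n_iter=100_000):
    """Estimate the largest Lyapunov exponent of the logistic map
    x -> r*x*(1-x) by averaging log|f'(x)| = log|r*(1-2x)| along the orbit."""
    x = x0
    # Discard transient iterations so we sample the attractor, not the approach to it.
    for _ in range(n_transient):
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(n_iter):
        total += math.log(abs(r * (1 - 2 * x)))
        x = r * x * (1 - x)
    return total / n_iter

print(lyapunov_logistic(4.0))  # ≈ 0.693 (ln 2): positive, i.e. chaotic
```

A positive exponent means nearby trajectories diverge exponentially — the loose analogy being a training run that is exquisitely sensitive to its initial weights or data ordering.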
In some of these cases, I think it's likely that I can reach original results within the next half year. I'm going to avoid overcommitting to any particular direction just yet, as I'm sure my questions will get sharper as I go deeper into the field.
Alongside this, I'm reaching out to several researchers in the field and offering myself up as a research monkey. I trust that insiders (PhD students in particular) will have better ideas than I can form as of yet but not enough resources to execute them, and that if I make myself useful, karma will follow.
Over the next three months, my priority is input — to complete the textbooks and courses mentioned above (which means taking notes, making flashcards, doing exercises). Over the subsequent three months, my priority is output — to publish & apply.
Of course, this is a simplification; research is a continuous process: I'll start producing output before the first three months are up, and I'll keep absorbing plenty of input after they're over. Still, heuristics are useful.
I'll be checking in here on a monthly basis — reviewing my progress over the previous month & updating my goals for the next month. Let's get this show on the road.
Month 1 (October)
- Finish part 1 of MML
- Finish chapters 1-5 of PRML
- Finish intro & chapters 1-5 of Cracking the Coding Interview
- Finish lessons 1-5 of Practical Deep Learning for Coders
- Finish weeks 1-5 of AGI Safety Fundamentals
- ██ ██████████ ████████ ███ ████████ ████████ ██████
At least not as a full-time occupation. I like creating things, but I also like actually using my brain, and too much of web development is mindless twiddling (even post-Copilot). ↩
More on why I think this soon. ↩
Whether in formal journals or informal blogs. ↩
I'm including less formal / "easier" sources because I need some fallback fodder (for when my brain can no longer handle the harder stuff) that isn't Twitter or Hacker News. ↩