2022-M11

It's been three months since I decided to pivot to AI safety. Who knew three months could be such a long time?

FTX

I got an FTX Future Fund regrant for six months to help make the switch. Then FTX imploded, and it turns out my grant may be clawed back during the bankruptcy proceedings. Unfortunate. On the bright side¹, FTX seems to have been such a mess that it could take years for the process to get to me. So if you have short enough timelines…²

Courses

The ARENA virtual program is going along smoothly.
This month, I joined SERI MATS to get my hands dirty in research. I'm officially under Evan Hubinger's Deceptive AI stream (though I'm also participating in John Wentworth's workshops). Yes, I'm reaching the limits of what I can juggle. We're also working our way through the Alignment 201 curriculum.³
When ARENA finishes up, I'm going to dedicate more of my attention to metauni (especially their track on singular learning theory (SLT)).
I've also finished most of the miscellaneous Alignment Forum sequences I wanted to go through (as well as AXRP and The Inside View).

The SERI MATS research sprint is about to start, and I'll be in Berkeley from January to at least February to work on this in person. Safe to say, I have way too many ideas for research projects, but I'm planning to focus on Toy Models of Superposition.

Distillation

I'm working on an introduction to SLT that should be out soon.
The video I'm working on with Hoog on AI risk is a little delayed because OpenPhil funding was paused (and I'm a little overextended but don't want to admit it), but it is coming along.
Next to all that, I've started working on an online, interactive AI safety textbook (very much a work in progress, more coming soon)⁴.

Textbooks

Mathematics for Machine Learning: Done. Great book. Highly recommend.
Pattern Recognition and Machine Learning: Completed up through chapter 9. I have 5 chapters to go. Instead of trying to bang these out in the next month to meet my original deadline, I'm going to push back my deadline by two weeks, so I have more time during the research sprint.
Cracking the Coding Interview: 6 more chapters to go. Like PRML, I'm going to push back my original deadline 2 weeks to mid-January.⁴
Reinforcement Learning: I've already gone through 6/17 chapters ahead of schedule. I'm aiming to be done by April.
Artificial Intelligence: A Modern Approach: Here too, I'm 3 chapters in, ahead of when I originally planned to get started. This is a big book, so my (self-enforced) deadline is May 1st.

Outreach

I'm writing this on a plane to EAGxBerkeley. Judging from EAGxRotterdam, EAGxBerkeley is going to be great (as long as they leave out the food poisoning part).
I've started putting out some feelers for a longer-term project of establishing an AI-safety-oriented company in mainland Europe. There's a lot of early-career interest among very smart students. Give them a few years, and mainland Europe will be ripe for a new organization like Anthropic, Redwood, or Conjecture.

Conclusion

The next month is going to be hectic. I'll be in the Bay Area for a week and a half, then my parents' place in NY state for a week, then Michigan at my ~~girlfriend's~~ fiancée's dad's, then NY for another week. Oh yeah, did I mention? I got engaged!⁵ On the day I publish this, it's our 5-year anniversary. Robin, I love you, and I can't wait to spend the rest of our lives together. (However long that is, doomer.)

For me. I think it's safe to say the inconvenience for me is less than the inconvenience for people who lost their life savings. ↩
My timelines aren't actually that short. But I'm not worried about eventually being able to pay this back (even very soon with the SERI MATS stipend). ↩
I've come to conclude that I can safely skip Intro to ML Safety for now. Much of the content overlaps with these other programs/textbooks. ↩
I'm probably a bit newer to the field than would be ideal for this task, so I'm hoping to migrate to a more editorial role, delegating the bits that I can. I think my main strength here is more a kind of Olahian interactive distillation. That's an ability which seems to be pretty rare among active researchers. ↩ ↩²
I proposed with an Oura ring, which definitely says something about the kind of people we are. Now that I think about it, I should have probably asked for a sponsorship and gotten the whole wedding funded by a late-stage-capitalism PR department, but hey, hindsight is 20/20. ↩
